firstmodtestdata = lm(Price~LotArea+YearBuilt+BasementSF+FullBath+TotalRooms+GarageCars+WoodDeckSF, data=AmesTest6)
summary(firstmodtestdata)
Call:
lm(formula = Price ~ LotArea + YearBuilt + BasementSF + FullBath +
    TotalRooms + GarageCars + WoodDeckSF, data = AmesTest6)
Residuals:
   Min     1Q Median     3Q    Max
-89.40 -22.42  -3.33  18.21 205.84
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.094e+03  2.534e+02  -4.318 2.52e-05 ***
LotArea     -5.590e-06  2.603e-04  -0.021    0.983
YearBuilt    5.382e-01  1.313e-01   4.100 6.10e-05 ***
BasementSF   5.516e-02  8.507e-03   6.484 7.36e-10 ***
FullBath     7.526e-01  7.662e+00   0.098    0.922
TotalRooms   1.575e+01  2.512e+00   6.271 2.32e-09 ***
GarageCars   3.032e+01  4.683e+00   6.475 7.73e-10 ***
WoodDeckSF   2.667e-02  2.237e-02   1.192    0.235
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 39.9 on 192 degrees of freedom
Multiple R-squared:  0.7171, Adjusted R-squared:  0.7068
F-statistic: 69.52 on 7 and 192 DF,  p-value: < 2.2e-16
predict(firstmodtestdata, newdata = AmesTest6, interval = "prediction", level = 0.95)
fit lwr upr
1 119.79432 40.5152173 199.0734
2 185.20921 103.0373915 267.3810
3 275.60888 195.7206281 355.4971
4 90.56750 10.3889986 170.7460
5 86.96406 4.8518632 169.0763
6 126.08187 46.7575130 205.4062
7 200.50411 120.4898033 280.5184
8 237.01907 157.4596418 316.5785
9 114.82438 35.5197614 194.1290
10 144.15482 64.4808637 223.8288
11 202.37219 120.6466901 284.0977
12 198.95260 119.5073488 278.3979
13 116.22655 34.9234336 197.5297
14 138.02012 57.7955372 218.2447
15 181.44156 101.4030738 261.4800
16 282.13434 202.4496567 361.8190
17 151.45066 71.9408564 230.9605
18 176.13772 96.6583594 255.6171
19 109.66909 30.2957665 189.0424
20 128.11028 48.7517242 207.4688
21 168.01417 87.9831435 248.0452
22 188.75791 108.8267372 268.6891
23 219.59739 140.2343007 298.9605
24 108.84314 29.1979588 188.4883
25 195.09081 115.5202960 274.6613
26 84.07868 4.5933893 163.5640
27 236.05029 156.6282877 315.4723
28 176.68544 97.2296266 256.1413
29 140.10917 60.5007887 219.7175
30 208.59573 129.2094646 287.9820
31 177.71860 93.4542332 261.9830
32 80.94277 -0.2203661 162.1059
33 248.78819 168.9840392 328.5923
34 177.56908 98.1220212 257.0161
35 229.49614 149.6695764 309.3227
36 235.50800 155.9280118 315.0880
37 203.26072 122.7924460 283.7290
38 268.34494 188.4888109 348.2011
39 138.62004 58.9141084 218.3260
40 185.65981 104.8518484 266.4678
41 168.23074 88.1750269 248.2865
42 55.81711 -24.3816553 136.0159
43 137.77802 57.7983272 217.7577
44 218.75158 135.8379209 301.6652
45 180.67127 101.1920290 260.1505
46 210.65405 130.4145386 290.8936
47 123.85706 43.6238691 204.0902
48 163.09484 83.4100980 242.7796
49 275.87649 195.9837153 355.7693
50 346.16506 264.4975692 427.8325
51 186.81834 107.2550225 266.3817
52 137.78640 57.0075978 218.5652
53 163.21033 83.4026336 243.0180
54 128.77159 49.3485348 208.1946
55 188.38632 106.8170245 269.9556
56 195.76089 115.9210747 275.6007
57 211.43980 131.9600581 290.9195
58 210.50282 130.8053147 290.2003
59 287.28702 207.4572322 367.1168
60 144.76171 64.8109114 224.7125
61 133.96963 54.4388225 213.5004
62 134.51164 54.9399192 214.0834
63 170.38406 90.5431180 250.2250
64 204.07838 124.4811273 283.6756
65 305.27627 225.4216817 385.1309
66 127.66096 48.0601917 207.2617
67 143.85761 64.4458487 223.2694
68 102.21825 22.4604426 181.9761
69 105.71369 26.3566438 185.0707
70 274.08487 192.8703984 355.2994
71 308.75013 227.8042068 389.6961
72 126.62414 47.0396355 206.2087
73 208.16756 128.1465377 288.1886
74 235.77503 156.3601701 315.1899
75 244.35647 164.9858838 323.7270
76 167.38927 87.9390722 246.8395
77 294.23159 213.9051280 374.5581
78 213.69345 133.8298219 293.5571
79 172.65664 92.9945005 252.3188
80 275.71954 192.7898831 358.6492
81 198.45265 115.2970060 281.6083
82 67.62563 -12.2175045 147.4688
83 129.18691 49.7081016 208.6657
84 226.65695 147.1919454 306.1220
85 201.42185 121.7322553 281.1114
86 210.20057 129.6588632 290.7423
87 245.28714 163.4744257 327.0999
88 95.52922 13.8945832 177.1639
89 141.51753 60.5590121 222.4760
90 178.89335 98.1294762 259.6572
91 145.49248 65.5823301 225.4026
92 162.67735 83.2308200 242.1239
93 104.77006 25.5319681 184.0081
94 78.00059 -1.5257067 157.5269
95 129.13558 49.6389468 208.6322
96 240.26617 160.9250730 319.6073
97 229.09864 149.2967966 308.9005
98 221.82081 142.2596961 301.3819
99 148.68380 69.1709076 228.1967
100 159.37320 79.8857695 238.8606
101 161.69776 80.6986976 242.6968
102 341.51848 260.2988486 422.7381
103 272.02799 191.2361126 352.8199
104 104.05197 24.3009462 183.8030
105 47.10138 -33.2845201 127.4873
106 216.67017 137.2407068 296.0996
107 263.24188 183.2905598 343.1932
108 119.06606 39.8509238 198.2812
109 291.59252 211.3661846 371.8189
110 105.04425 25.4875235 184.6010
111 158.37215 79.0431348 237.7012
112 76.18013 -3.7819155 156.1422
113 161.49288 80.4956209 242.4901
114 32.40084 -48.5150062 113.3167
115 94.29581 14.8171069 173.7745
116 197.12729 116.8510574 277.4035
117 165.73389 86.2411214 245.2267
118 100.31488 20.8076103 179.8221
119 107.84266 27.6557645 188.0296
120 247.50975 166.0795033 328.9400
121 36.82383 -44.2753782 117.9230
122 121.88925 41.9830717 201.7954
123 157.26566 77.7658404 236.7655
124 156.46568 75.9871461 236.9442
125 297.35013 216.0184503 378.6818
126 189.70275 109.5511470 269.8543
127 133.06156 52.9207362 213.2024
128 152.17738 72.7469421 231.6078
129 56.70398 -23.2363804 136.6443
130 129.83372 50.2603842 209.4071
131 129.36215 50.0047702 208.7195
132 145.45627 62.6684507 228.2441
133 192.60650 112.1461309 273.0669
134 72.67559 -7.6281967 152.9794
135 203.42963 124.0768193 282.7824
136 152.56893 73.1675271 231.9703
137 187.51218 107.3599896 267.6644
138 218.48547 138.3508767 298.6201
139 142.50647 61.4328906 223.5800
140 120.00407 38.3054306 201.7027
141 36.66320 -43.6243616 116.9508
142 292.34015 212.2969048 372.3834
143 203.93579 124.3287521 283.5428
144 202.21936 122.3564080 282.0823
145 140.19638 60.7979408 219.5948
146 173.89491 92.0095440 255.7803
147 103.12039 22.6092085 183.6316
148 144.99724 65.1837300 224.8108
149 213.96240 134.5627470 293.3621
150 193.89500 113.8068680 273.9831
151 142.86456 62.3486648 223.3805
152 123.00017 43.6163834 202.3840
153 240.30665 160.6690628 319.9442
154 107.84864 28.5611069 187.1362
155 166.90726 87.1000131 246.7145
156 235.32704 126.9036743 343.7504
157 215.00546 135.3627276 294.6482
158 200.81353 121.2973076 280.3298
159 136.85179 57.1292437 216.5743
160 103.14306 22.1624180 184.1237
161 274.79477 194.8895644 354.7000
162 201.41444 121.9618817 280.8670
163 231.26161 151.6439835 310.8792
164 229.01444 148.6243064 309.4046
165 180.00592 98.3362443 261.6756
166 106.26796 26.8057601 185.7302
167 177.63164 97.8459479 257.4173
168 193.92888 114.3845005 273.4733
169 210.61053 130.7533819 290.4677
170 206.93953 127.3798027 286.4993
171 133.26973 53.3533513 213.1861
172 201.90450 122.2733228 281.5357
173 150.15794 70.6435634 229.6723
174 137.96084 58.6270643 217.2946
175 111.82222 30.5850327 193.0594
176 176.50802 96.9442410 256.0718
177 277.33212 197.1893522 357.4749
178 115.51149 35.5933913 195.4296
179 144.23473 64.8254012 223.6440
180 267.63775 187.9786791 347.2968
181 151.01391 71.3849998 230.6428
182 131.01004 49.8363080 212.1838
183 101.86943 21.7993302 181.9395
184 112.88288 33.6139060 192.1518
185 210.44441 130.7608902 290.1279
186 219.43522 138.7134067 300.1570
187 147.69957 66.4253418 228.9738
188 349.31491 267.8233528 430.8065
189 152.70685 73.2617209 232.1520
190 153.43281 73.9968210 232.8688
191 203.30477 124.0193554 282.5902
192 262.86770 183.0208294 342.7146
193 140.04478 60.3634818 219.7261
194 249.00854 169.5194115 328.4977
195 174.00818 93.7265263 254.2898
196 202.52187 122.0433427 283.0004
197 201.60491 121.9709350 281.2389
198 104.06673 24.8486963 183.2848
199 219.39930 135.8095219 302.9891
200 207.69075 127.4322555 287.9492
rstandard(firstmodtestdata)
1 2 3 4 5 6 7 8
0.873588341 -0.924906821 0.447640499 -0.027270222 1.063910057 0.174744457 -0.191277821 0.944573602
9 10 11 12 13 14 15 16
1.126501899 -1.298045855 -1.784521427 -0.390931876 0.928112004 -0.251019846 -0.967433558 -0.853587045
17 18 19 20 21 22 23 24
-0.087371358 -0.409712089 -0.547687199 0.028800886 -0.755026464 0.006164081 0.894683010 0.282996269
25 26 27 28 29 30 31 32
-0.838532129 0.656121242 0.479258899 0.020610155 -1.232385502 -0.217293964 -0.819525973 0.752386583
33 34 35 36 37 38 39 40
1.301730555 -0.368588576 0.012811135 -1.571498698 -1.006890504 0.321895392 0.339657829 1.348589750
41 42 43 44 45 46 47 48
0.963262369 -0.238079154 0.475724235 2.200365674 -0.105575016 1.926363924 0.042001460 0.073728503
49 50 51 52 53 54 55 56
0.285366801 5.367860526 0.004602895 -0.535364029 -0.348513231 -0.006868886 -1.351224074 0.730852565
57 58 59 60 61 62 63 64
-1.048849353 -2.043381957 -0.312424719 -0.531309368 -0.334815399 -0.114328389 -0.276792206 -0.230129988
65 66 67 68 69 70 71 72
0.679733427 0.109996480 -0.454107013 -0.058890478 -0.018034749 0.723333552 1.941805102 -0.421338773
73 74 75 76 77 78 79 80
-1.253381349 -0.272486984 0.774488798 -0.566457751 2.784140822 -0.602728826 0.211678634 -1.994937108
81 82 83 84 85 86 87 88
-2.024610829 0.543586839 -0.105970303 -1.233809829 -0.462472098 2.511469835 -0.085901980 0.325066427
89 90 91 92 93 94 95 96
-0.555573111 -0.177505261 0.494006202 -0.573718685 0.384261205 0.272235494 -0.016090265 -0.309897886
97 98 99 100 101 102 103 104
-0.396482995 -1.236982818 -0.169241707 -0.085384939 -0.043860188 2.861695728 -1.494809388 -0.102922941
105 106 107 108 109 110 111 112
0.817165306 -1.180707634 -0.795678800 0.338846184 1.157144516 -0.077128355 -1.158751459 0.963346888
113 114 115 116 117 118 119 120
1.059355672 1.483725472 0.777118380 -0.566006929 -1.410876007 0.422455873 0.233959133 0.480574898
121 122 123 124 125 126 127 128
1.453286050 0.460973322 -0.057359311 1.116648142 -0.125883204 -0.761063412 -0.346267144 -0.702597768
129 130 131 132 133 134 135 136
0.707814946 1.144570875 0.091927590 -0.144639660 -1.349014738 1.338935281 -1.021587372 -0.697060989
137 138 139 140 141 142 143 144
0.123542811 -0.880443400 0.180859063 0.208614507 0.699371211 0.514068479 -0.581479506 -0.895924381
145 146 147 148 149 150 151 152
-0.004965056 -1.436055172 0.073894261 0.114467801 0.089443594 0.717097685 -0.163333065 -0.202232289
153 154 155 156 157 158 159 160
-0.667207525 0.130055513 0.587008268 -0.499410733 0.161432770 -0.833445933 0.079936551 0.022132570
161 162 163 164 165 166 167 168
0.410965109 -0.668316920 1.032963767 -0.999512930 -1.043325588 1.081307876 0.111010308 -0.102058043
169 170 171 172 173 174 175 176
-0.524258481 3.498002418 0.725007470 -0.783755051 -0.155929312 -0.087427781 -0.474914350 -0.240915045
177 178 179 180 181 182 183 184
-1.259629358 -0.906564109 0.014294054 -1.208817271 -0.570948165 -0.440510642 0.232909695 1.321785621
185 186 187 188 189 190 191 192
-0.239680635 3.594317257 -0.225611359 0.798219974 -0.650350729 -0.972185513 -1.093278608 -0.454428554
193 194 195 196 197 198 199 200
-0.511220334 0.341513434 0.856769820 -1.603675529 0.086104606 0.442260731 -2.398959566 -0.375693285
rstudent(firstmodtestdata)
1 2 3 4 5 6 7 8
0.873047211 -0.924557038 0.446706411 -0.027199166 1.064277615 0.174302661 -0.190797232 0.944307205
9 10 11 12 13 14 15 16
1.127296027 -1.300379520 -1.794814826 -0.390067769 0.927775423 -0.250406388 -0.967271333 -0.852981266
17 18 19 20 21 22 23 24
-0.087145264 -0.408822492 -0.546686279 0.028725848 -0.754178125 0.006148009 0.894216028 0.282317223
25 26 27 28 29 30 31 32
-0.837881239 0.655145245 0.478295376 0.020556435 -1.234062556 -0.216754010 -0.818822394 0.751533399
33 34 35 36 37 38 39 40
1.304103665 -0.367757591 0.012777735 -1.577579590 -1.006926956 0.321142696 0.338873974 1.351489344
41 42 43 44 45 46 47 48
0.963080547 -0.237493405 0.474763641 2.222833296 -0.105302779 1.940181703 0.041892131 0.073537292
49 50 51 52 53 54 55 56
0.284683067 5.807328060 0.004590893 -0.534367028 -0.347714461 -0.006850976 -1.354154653 0.729962907
57 58 59 60 61 62 63 64
-1.049124262 -2.060582528 -0.311689289 -0.530313937 -0.334039876 -0.114034152 -0.276125549 -0.229561571
65 66 67 68 69 70 71 72
0.678778190 0.109713114 -0.453166320 -0.058737448 -0.017987737 0.722432419 1.956043914 -0.420434520
73 74 75 76 77 78 79 80
-1.255258954 -0.271829022 0.773678749 -0.565453372 2.834691691 -0.601726702 0.211151308 -2.010683006
81 82 83 84 85 86 87 88
-2.041238495 0.542587080 -0.105697069 -1.235500214 -0.461523302 2.547108503 -0.085679632 0.324308049
89 90 91 92 93 94 95 96
-0.554570364 -0.177056933 0.493031482 -0.572713799 0.383406675 0.271578041 -0.016048319 -0.309167137
97 98 99 100 101 102 103 104
-0.395611124 -1.238703042 -0.168812991 -0.085163909 -0.043746039 2.917123039 -1.499663493 -0.102657395
105 106 107 108 109 110 111 112
0.816455510 -1.181927538 -0.794915682 0.338063717 1.158172724 -0.076928429 -1.159792414 0.963165460
113 114 115 116 117 118 119 120
1.059694814 1.488414119 0.776313855 -0.565002596 -1.414548867 0.421550258 0.233382339 0.479610310
121 122 123 124 125 126 127 128
1.457535276 0.460025941 -0.057210233 1.117370580 -0.125560137 -0.760226465 -0.345472118 -0.701668294
129 130 131 132 133 134 135 136
0.706892151 1.145500975 0.091689901 -0.144270363 -1.351919320 1.341722593 -1.021704101 -0.696124755
137 138 139 140 141 142 143 144
0.123225563 -0.879925687 0.180402829 0.208094116 0.698437757 0.513081232 -0.580474602 -0.895461956
145 146 147 148 149 150 151 152
-0.004952110 -1.440065247 0.073702624 0.114173214 0.089212223 0.716187523 -0.162918482 -0.201726441
153 154 155 156 157 158 159 160
-0.666240549 0.129722099 0.586003686 -0.498432327 0.161022751 -0.832780481 0.079729438 0.022074886
161 162 163 164 165 166 167 168
0.410073891 -0.667350917 1.033145027 -0.999510381 -1.043567462 1.081787216 0.110724395 -0.101794681
169 170 171 172 173 174 175 176
-0.523266103 3.605665651 0.724108836 -0.782964844 -0.155532564 -0.087201543 -0.473954439 -0.240323170
177 178 179 180 181 182 183 184
-1.261568334 -0.906141639 0.014256789 -1.210279485 -0.569943417 -0.439584176 0.232335191 1.324378384
185 186 187 188 189 190 191 192
-0.239091423 3.712004359 -0.225052896 0.797462859 -0.649370539 -0.972045937 -1.093837863 -0.453487541
193 194 195 196 197 198 199 200
-0.510234672 0.340726419 0.856173966 -1.610315010 0.085881741 0.441332362 -2.429390387 -0.374851448
plot(firstmodtestdata$residuals~firstmodtestdata$fitted.values)
abline(a=0, b=0)
mean(firstmodtestdata$residuals)
[1] -2.966551e-16
sd(firstmodtestdata$residuals)
[1] 39.19671
hist(firstmodtestdata$residuals)
firstmodtraindata = lm(Price~LotArea+YearBuilt+BasementSF+FullBath+TotalRooms+GarageCars+WoodDeckSF, data=AmesTrain6a)
mean(firstmodtraindata$residuals)
[1] -1.342822e-16
sd(firstmodtraindata$residuals)
[1] 39.27828
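Both residual means are zero up to floating-point error (about -3.0e-16 for the model fit to the test data and -1.3e-16 for the model fit to the training data). This is expected, because least-squares residuals from a model with an intercept always average to zero on the data used to fit it, so on its own it does not tell us about overfitting. The residual standard deviations are also very similar (39.20 on the test data versus 39.28 on the training data), which suggests the errors are of comparable size in both data sets and is a good sign that our model was not overfitted.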
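One caveat when reading residual means: for any least-squares fit that includes an intercept, the in-sample residuals average to zero by construction, so values on the order of 1e-16 are rounding noise. A minimal sketch on simulated data illustrates this (the objects `x`, `y`, and `toy` are hypothetical and not part of the Ames analysis):

```r
# Simulated data only; not the Ames data.
set.seed(1)
x <- rnorm(50)
y <- 3 + 2 * x + rnorm(50, sd = 5)
toy <- lm(y ~ x)

# In-sample residuals of a model with an intercept average to zero
# (up to rounding error), however well or badly the model fits.
mean_resid <- mean(residuals(toy))
abs(mean_resid) < 1e-10
```

For this reason the residual standard deviation, or a holdout comparison, is usually the more informative check.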
which.max(rstandard(firstmodtestdata))
50
50
max(rstandard(firstmodtestdata))
[1] 5.367861
which.max(rstudent(firstmodtestdata))
50
50
max(rstudent(firstmodtestdata))
[1] 5.807328
which.min(rstandard(firstmodtestdata))
199
199
min(rstandard(firstmodtestdata))
[1] -2.39896
which.min(rstudent(firstmodtestdata))
199
199
min(rstudent(firstmodtestdata))
[1] -2.42939
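Some residuals are above the usual rule-of-thumb thresholds for concern (an absolute value above 2 is worth a look, and above 3 is a likely outlier). Observation 50 is the most extreme, with a standardized residual of 5.37 and a studentized residual of 5.81; observations 170 and 186 also exceed 3, and the most negative case, observation 199, is about -2.4 on both scales. These are not substantially different from the outliers we had in our first model, so overfitting is likely not the cause of the outliers of concern. Finally, the values change little between rstandard and rstudent, which means that leaving any one of these points out does not significantly alter the model's estimated error.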
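Rather than inspecting only the maximum and minimum, the unusual cases can be listed directly. The sketch below uses simulated data with one planted outlier; the object names and the cutoff of 3 are illustrative assumptions, not values taken from the analysis above.

```r
# Simulated data with one planted outlier; names are illustrative only.
set.seed(2)
x <- rnorm(100)
y <- 1 + 2 * x + rnorm(100)
y[10] <- y[10] + 15          # plant an unusual response value
toy <- lm(y ~ x)

r_std  <- rstandard(toy)     # standardized (internally studentized) residuals
r_stud <- rstudent(toy)      # studentized: error SD re-estimated without case i

# Rule of thumb: |r| > 2 is worth a look, |r| > 3 is a likely outlier.
flagged <- which(abs(r_stud) > 3)
data.frame(obs = flagged,
           rstandard = r_std[flagged],
           rstudent  = r_stud[flagged])
```

The same `which(abs(...) > 3)` pattern could be applied to `rstudent(firstmodtestdata)` to list every flagged house at once.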
summary(firstmodtraindata)
Call:
lm(formula = Price ~ LotArea + YearBuilt + BasementSF + FullBath +
    TotalRooms + GarageCars + WoodDeckSF, data = AmesTrain6a)
Residuals:
     Min       1Q   Median       3Q      Max
-143.676  -22.289   -3.192   16.325  223.801
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.418e+03  1.390e+02 -10.197  < 2e-16 ***
LotArea      1.753e-03  4.293e-04   4.084 5.04e-05 ***
YearBuilt    7.127e-01  7.154e-02   9.962  < 2e-16 ***
BasementSF   4.947e-02  4.610e-03  10.731  < 2e-16 ***
FullBath     8.294e+00  4.097e+00   2.025   0.0434 *
TotalRooms   1.091e+01  1.412e+00   7.723 4.92e-14 ***
GarageCars   1.810e+01  2.885e+00   6.274 6.85e-10 ***
WoodDeckSF   6.867e-02  1.453e-02   4.725 2.88e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 39.51 on 587 degrees of freedom
Multiple R-squared:  0.6833, Adjusted R-squared:  0.6795
F-statistic: 180.9 on 7 and 587 DF,  p-value: < 2.2e-16
fitAmes=predict(firstmodtraindata, newdata=AmesTest6)
holdoutresid=(AmesTest6$Price)-fitAmes
mean(holdoutresid)
[1] 0.01494272
cor(AmesTest6$Price, fitAmes)
[1] 0.7888105
crosscorr=cor(AmesTest6$Price, fitAmes)
crosscorr^2
[1] 0.622222
0.6795-crosscorr^2
[1] 0.05727804
Our shrinkage is 0.0573 (the training adjusted R-squared of 0.6795 minus the squared cross-validation correlation of 0.6222), which indicates that our model fits the test data almost as well as it fits the training data. This means that we did not overfit our model, which makes sense because this was the most basic model that we used.
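The cross-validation steps above can be collected into one helper. This is a sketch under stated assumptions: the function `shrinkage()` and the simulated data are hypothetical, and the helper subtracts from the training multiple R-squared, whereas the calculation above starts from the adjusted value of 0.6795 (starting from the multiple R-squared of 0.6833 instead would give about 0.061).

```r
# Hypothetical helper: squared cross-validation correlation and shrinkage
# for a model fit on training data and evaluated on held-out data.
shrinkage <- function(train_fit, test_data, response) {
  fitted_test <- predict(train_fit, newdata = test_data)
  crosscorr   <- cor(test_data[[response]], fitted_test)
  train_r2    <- summary(train_fit)$r.squared   # multiple R-squared (training)
  c(crosscorr_sq = crosscorr^2,
    train_r2     = train_r2,
    shrinkage    = train_r2 - crosscorr^2)
}

# Simulated stand-in for a train/test split (not the Ames data).
set.seed(3)
d <- data.frame(x1 = rnorm(300), x2 = rnorm(300))
d$y <- 5 + 2 * d$x1 - d$x2 + rnorm(300, sd = 2)
train <- d[1:200, ]
test  <- d[201:300, ]

res <- shrinkage(lm(y ~ x1 + x2, data = train), test, "y")
res
```

With the objects from this analysis, the call would look like `shrinkage(firstmodtraindata, AmesTest6, "Price")`. A shrinkage near zero, or even slightly negative, means the model predicts the holdout data about as well as the training data.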
modTransformCat=lm(Price~factor(HouseStyle)+factor(ExteriorQ)+factor(BasementFin)+factor(HeatingQC)+factor(KitchenQ)+factor(ExteriorC)+factor(CentralAir)+factor(GarageQ)+factor(Foundation)+factor(GarageC)+factor(BasementHt)+factor(GarageType)+factor(LotConfig)+factor(BasementC)+factor(Heating)+factor(Condition), data=AmesTrain6a)
summary(modTransformCat)
Call:
lm(formula = Price ~ factor(HouseStyle) + factor(ExteriorQ) +
factor(BasementFin) + factor(HeatingQC) + factor(KitchenQ) +
factor(ExteriorC) + factor(CentralAir) + factor(GarageQ) +
factor(Foundation) + factor(GarageC) + factor(BasementHt) +
factor(GarageType) + factor(LotConfig) + factor(BasementC) +
factor(Heating) + factor(Condition), data = AmesTrain6a)
Residuals:
Min 1Q Median 3Q Max
-96.316 -23.460 -0.895 18.138 154.613
Coefficients: (4 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 276.8311 87.8232 3.152 0.001712 **
factor(HouseStyle)1.5Unf -4.5273 20.4211 -0.222 0.824633
factor(HouseStyle)1Story -0.0779 6.5346 -0.012 0.990493
factor(HouseStyle)2.5Unf 42.7315 23.8827 1.789 0.074150 .
factor(HouseStyle)2Story 11.6310 7.0351 1.653 0.098862 .
factor(HouseStyle)SFoyer -24.9542 12.1385 -2.056 0.040291 *
factor(HouseStyle)SLvl -7.8712 9.2634 -0.850 0.395873
factor(ExteriorQ)Fa -127.2075 23.2606 -5.469 6.99e-08 ***
factor(ExteriorQ)Gd -62.8435 13.3102 -4.721 3.00e-06 ***
factor(ExteriorQ)TA -94.3985 14.1932 -6.651 7.27e-11 ***
factor(BasementFin)BLQ 0.8396 7.0710 0.119 0.905529
factor(BasementFin)GLQ 3.5429 5.5164 0.642 0.520991
factor(BasementFin)LwQ 1.7849 8.8851 0.201 0.840869
factor(BasementFin)None -110.4109 42.1347 -2.620 0.009034 **
factor(BasementFin)Rec 1.2895 7.3736 0.175 0.861240
factor(BasementFin)Unf -6.0051 5.3489 -1.123 0.262085
factor(HeatingQC)Fa -0.6798 11.2329 -0.061 0.951765
factor(HeatingQC)Gd 4.0802 5.3941 0.756 0.449737
factor(HeatingQC)TA -1.1773 4.9969 -0.236 0.813823
factor(KitchenQ)Fa -52.7203 15.1922 -3.470 0.000562 ***
factor(KitchenQ)Gd -34.8167 9.5749 -3.636 0.000304 ***
factor(KitchenQ)TA -40.5129 10.1530 -3.990 7.53e-05 ***
factor(ExteriorC)Fa 4.3272 31.9069 0.136 0.892173
factor(ExteriorC)Gd -25.8265 26.4996 -0.975 0.330203
factor(ExteriorC)TA -19.6775 27.3093 -0.721 0.471509
factor(CentralAir)Y 5.2874 8.6428 0.612 0.540953
factor(GarageQ)Gd 61.5677 27.1955 2.264 0.023984 *
factor(GarageQ)None 46.7622 50.8032 0.920 0.357753
factor(GarageQ)Po 49.5994 70.6183 0.702 0.482764
factor(GarageQ)TA 0.1246 10.8491 0.011 0.990844
factor(Foundation)CBlock 1.7521 7.6006 0.231 0.817772
factor(Foundation)PConc 11.4451 8.6758 1.319 0.187676
factor(Foundation)Slab 12.2294 22.4476 0.545 0.586120
factor(Foundation)Stone 18.3872 24.4244 0.753 0.451891
factor(Foundation)Wood 24.3737 29.0463 0.839 0.401774
factor(GarageC)Fa 85.3064 48.2085 1.770 0.077380 .
factor(GarageC)Gd 98.6830 50.8363 1.941 0.052765 .
factor(GarageC)None NA NA NA NA
factor(GarageC)Po 88.2301 55.9473 1.577 0.115387
factor(GarageC)TA 98.0079 46.5230 2.107 0.035617 *
factor(BasementHt)Fa -89.7863 14.3966 -6.237 9.14e-10 ***
factor(BasementHt)Gd -58.4549 8.2455 -7.089 4.33e-12 ***
factor(BasementHt)None NA NA NA NA
factor(BasementHt)TA -72.2207 10.0079 -7.216 1.86e-12 ***
factor(GarageType)Attchd -17.1134 18.0483 -0.948 0.343460
factor(GarageType)Basment -32.2733 23.4157 -1.378 0.168700
factor(GarageType)BuiltIn -5.6891 19.0984 -0.298 0.765907
factor(GarageType)CarPort -58.3085 45.1565 -1.291 0.197178
factor(GarageType)Detchd -46.0371 18.1132 -2.542 0.011317 *
factor(GarageType)None NA NA NA NA
factor(LotConfig)CulDSac 7.4004 7.6958 0.962 0.336682
factor(LotConfig)FR2 -20.1713 8.8796 -2.272 0.023509 *
factor(LotConfig)FR3 10.9377 20.5572 0.532 0.594905
factor(LotConfig)Inside -7.3204 4.6136 -1.587 0.113172
factor(BasementC)Fa -7.9727 39.4513 -0.202 0.839924
factor(BasementC)Gd -1.8171 39.2841 -0.046 0.963124
factor(BasementC)None NA NA NA NA
factor(BasementC)TA -18.5301 38.3507 -0.483 0.629172
factor(Heating)GasW 24.5236 17.0449 1.439 0.150806
factor(Heating)Grav -43.9947 45.3212 -0.971 0.332125
factor(Heating)OthW -1.8212 41.8501 -0.044 0.965306
factor(Heating)Wall 8.0581 32.3359 0.249 0.803302
factor(Condition)3 0.4042 51.1716 0.008 0.993701
factor(Condition)4 22.0054 50.2147 0.438 0.661400
factor(Condition)5 35.7435 50.0151 0.715 0.475137
factor(Condition)6 33.5797 50.1292 0.670 0.503237
factor(Condition)7 36.8567 50.2968 0.733 0.464014
factor(Condition)8 44.5512 50.5200 0.882 0.378256
factor(Condition)9 49.7814 52.3594 0.951 0.342158
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 37.57 on 530 degrees of freedom
Multiple R-squared: 0.7414, Adjusted R-squared: 0.7102
F-statistic: 23.75 on 64 and 530 DF, p-value: < 2.2e-16
MSE=(summary(modTransformCat)$sigma)^2
step(none,scope=list(upper=modTransformCat),scale=MSE)
Start: AIC=1456.74
Price ~ 1
Df Sum of Sq RSS Cp
+ factor(ExteriorQ) 3 1606904 1286545 324.40
+ factor(BasementHt) 4 1555913 1337536 362.52
+ factor(KitchenQ) 3 1305178 1588271 538.14
+ factor(Foundation) 5 973531 1919917 777.09
+ factor(GarageType) 6 737589 2155859 946.23
+ factor(HeatingQC) 3 635156 2258293 1012.79
+ factor(Condition) 7 546078 2347371 1083.90
+ factor(BasementFin) 6 515220 2378228 1103.76
+ factor(HouseStyle) 6 230783 2662666 1305.26
+ factor(GarageC) 5 220424 2673025 1310.59
+ factor(GarageQ) 4 213737 2679712 1313.33
+ factor(CentralAir) 1 171099 2722350 1337.54
+ factor(BasementC) 4 116360 2777088 1382.31
+ factor(ExteriorC) 3 73758 2819691 1410.49
+ factor(LotConfig) 4 40148 2853301 1436.30
+ factor(Heating) 4 31951 2861498 1442.11
<none> 2893449 1456.74
Step: AIC=324.4
Price ~ factor(ExteriorQ)
Df Sum of Sq RSS Cp
+ factor(BasementHt) 4 250784 1035761 154.74
+ factor(GarageType) 6 213599 1072946 185.08
+ factor(Foundation) 5 127908 1158637 243.79
+ factor(KitchenQ) 3 111799 1174746 251.20
+ factor(GarageC) 5 61878 1224668 290.57
+ factor(CentralAir) 1 49963 1236582 291.01
+ factor(GarageQ) 4 51506 1235039 295.91
+ factor(Condition) 7 56831 1229715 298.14
+ factor(HouseStyle) 6 51999 1234546 299.56
+ factor(BasementFin) 6 50313 1236232 300.76
+ factor(HeatingQC) 3 34665 1251880 305.84
+ factor(LotConfig) 4 37324 1249222 305.96
+ factor(BasementC) 4 31824 1254721 309.86
+ factor(Heating) 4 14719 1271826 321.97
+ factor(ExteriorC) 3 9120 1277425 323.94
<none> 1286545 324.40
- factor(ExteriorQ) 3 1606904 2893449 1456.74
Step: AIC=154.74
Price ~ factor(ExteriorQ) + factor(BasementHt)
Df Sum of Sq RSS Cp
+ factor(GarageType) 6 124591 911170 78.481
+ factor(KitchenQ) 3 58853 976908 119.050
+ factor(GarageQ) 4 38749 997012 135.292
+ factor(HouseStyle) 6 43956 991805 135.604
+ factor(GarageC) 5 38839 996922 137.228
+ factor(CentralAir) 1 23287 1012474 140.246
+ factor(Foundation) 5 32902 1002859 141.434
+ factor(LotConfig) 4 24518 1011243 145.374
+ factor(HeatingQC) 3 19358 1016403 147.029
+ factor(Condition) 7 29957 1005804 147.521
<none> 1035761 154.742
+ factor(ExteriorC) 3 6387 1029374 156.217
+ factor(Heating) 4 9177 1026584 156.241
+ factor(BasementC) 3 5603 1030159 156.773
+ factor(BasementFin) 5 9762 1025999 157.827
- factor(BasementHt) 4 250784 1286545 324.400
- factor(ExteriorQ) 3 301775 1337536 362.522
Step: AIC=78.48
Price ~ factor(ExteriorQ) + factor(BasementHt) + factor(GarageType)
Df Sum of Sq RSS Cp
+ factor(KitchenQ) 3 46077 865093 51.839
+ factor(HouseStyle) 6 33520 877650 66.735
+ factor(Condition) 7 30314 880856 71.006
+ factor(LotConfig) 4 17289 893881 74.233
+ factor(GarageQ) 3 13232 897938 75.107
+ factor(BasementC) 3 11289 899881 76.484
+ factor(CentralAir) 1 4507 906662 77.288
+ factor(HeatingQC) 3 9175 901995 77.981
+ factor(ExteriorC) 3 8888 902282 78.184
+ factor(Foundation) 5 14356 896814 78.311
<none> 911170 78.481
+ factor(GarageC) 4 7043 904127 81.491
+ factor(Heating) 4 5425 905744 82.637
+ factor(BasementFin) 5 6296 904874 84.021
- factor(GarageType) 6 124591 1035761 154.742
- factor(BasementHt) 4 161777 1072946 185.085
- factor(ExteriorQ) 3 262738 1173908 258.607
Step: AIC=51.84
Price ~ factor(ExteriorQ) + factor(BasementHt) + factor(GarageType) +
factor(KitchenQ)
Df Sum of Sq RSS Cp
+ factor(HouseStyle) 6 29516 835576 42.930
+ factor(LotConfig) 4 16523 848569 48.134
+ factor(Condition) 7 20942 844151 51.004
+ factor(BasementC) 3 9422 855671 51.165
<none> 865093 51.839
+ factor(CentralAir) 1 2384 862708 52.150
+ factor(GarageQ) 3 7667 857426 52.408
+ factor(Foundation) 5 11629 853463 53.601
+ factor(ExteriorC) 3 4888 860204 54.376
+ factor(GarageC) 4 7585 857508 54.466
+ factor(HeatingQC) 3 2954 862139 55.747
+ factor(Heating) 4 3609 861484 57.283
+ factor(BasementFin) 5 4771 860322 58.459
- factor(KitchenQ) 3 46077 911170 78.481
- factor(GarageType) 6 111815 976908 119.050
- factor(ExteriorQ) 3 111470 976562 124.805
- factor(BasementHt) 4 131934 997027 137.303
Step: AIC=42.93
Price ~ factor(ExteriorQ) + factor(BasementHt) + factor(GarageType) +
factor(KitchenQ) + factor(HouseStyle)
Df Sum of Sq RSS Cp
+ factor(LotConfig) 4 19231 816346 37.306
+ factor(BasementC) 3 10072 825504 41.795
+ factor(Condition) 7 21361 814215 41.797
+ factor(CentralAir) 1 3577 832000 42.396
<none> 835576 42.930
+ factor(GarageQ) 3 7821 827756 43.389
+ factor(GarageC) 4 7300 828276 45.758
+ factor(ExteriorC) 3 3909 831667 46.160
+ factor(BasementFin) 5 8833 826743 46.672
+ factor(Foundation) 5 8749 826827 46.732
+ factor(HeatingQC) 3 2709 832867 47.010
+ factor(Heating) 4 2844 832732 48.915
- factor(HouseStyle) 6 29516 865093 51.839
- factor(KitchenQ) 3 42073 877650 66.735
- factor(GarageType) 6 103598 939174 104.319
- factor(ExteriorQ) 3 103129 938705 109.987
- factor(BasementHt) 4 135911 971487 131.210
Step: AIC=37.31
Price ~ factor(ExteriorQ) + factor(BasementHt) + factor(GarageType) +
factor(KitchenQ) + factor(HouseStyle) + factor(LotConfig)
Df Sum of Sq RSS Cp
+ factor(Condition) 7 21085 795261 36.370
+ factor(BasementC) 3 9549 806796 36.542
+ factor(CentralAir) 1 3152 813193 37.073
<none> 816346 37.306
+ factor(GarageQ) 3 5790 810556 39.205
+ factor(BasementFin) 5 9685 806661 40.446
+ factor(ExteriorC) 3 3928 812418 40.524
+ factor(Foundation) 5 9138 807207 40.833
+ factor(GarageC) 4 6012 810334 41.048
+ factor(HeatingQC) 3 2690 813656 41.401
- factor(LotConfig) 4 19231 835576 42.930
+ factor(Heating) 4 2312 814033 43.669
- factor(HouseStyle) 6 32224 848569 48.134
- factor(KitchenQ) 3 41042 857388 60.381
- factor(GarageType) 6 98041 914387 94.760
- factor(ExteriorQ) 3 107192 923538 107.243
- factor(BasementHt) 4 128373 944719 120.247
Step: AIC=36.37
Price ~ factor(ExteriorQ) + factor(BasementHt) + factor(GarageType) +
factor(KitchenQ) + factor(HouseStyle) + factor(LotConfig) +
factor(Condition)
Df Sum of Sq RSS Cp
<none> 795261 36.370
+ factor(BasementC) 3 7759 787502 36.873
- factor(Condition) 7 21085 816346 37.306
+ factor(CentralAir) 1 825 794436 37.786
+ factor(GarageQ) 3 5350 789911 38.580
+ factor(Foundation) 5 10768 784493 38.742
+ factor(ExteriorC) 3 3198 792063 40.104
+ factor(BasementFin) 5 8249 787012 40.526
+ factor(HeatingQC) 3 1677 793584 41.182
+ factor(GarageC) 4 3844 791417 41.647
- factor(LotConfig) 4 18954 814215 41.797
+ factor(Heating) 4 2858 792403 42.345
- factor(HouseStyle) 6 32660 827921 47.506
- factor(KitchenQ) 3 33199 828460 53.889
- factor(GarageType) 6 98676 893937 94.273
- factor(ExteriorQ) 3 97091 892352 99.150
- factor(BasementHt) 4 126197 921458 117.769
Call:
lm(formula = Price ~ factor(ExteriorQ) + factor(BasementHt) +
factor(GarageType) + factor(KitchenQ) + factor(HouseStyle) +
factor(LotConfig) + factor(Condition), data = AmesTrain6a)
Coefficients:
(Intercept) factor(ExteriorQ)Fa factor(ExteriorQ)Gd factor(ExteriorQ)TA
371.587 -127.471 -64.446 -97.138
factor(BasementHt)Fa factor(BasementHt)Gd factor(BasementHt)None factor(BasementHt)TA
-101.217 -61.932 -93.392 -80.750
factor(GarageType)Attchd factor(GarageType)Basment factor(GarageType)BuiltIn factor(GarageType)CarPort
-15.095 -33.555 -7.098 -67.326
factor(GarageType)Detchd factor(GarageType)None factor(KitchenQ)Fa factor(KitchenQ)Gd
-46.666 -47.290 -55.587 -34.705
factor(KitchenQ)TA factor(HouseStyle)1.5Unf factor(HouseStyle)1Story factor(HouseStyle)2.5Unf
-43.160 -12.327 -2.912 32.865
factor(HouseStyle)2Story factor(HouseStyle)SFoyer factor(HouseStyle)SLvl factor(LotConfig)CulDSac
9.627 -28.005 -8.037 9.395
factor(LotConfig)FR2 factor(LotConfig)FR3 factor(LotConfig)Inside factor(Condition)3
-20.856 12.616 -7.700 -6.741
factor(Condition)4 factor(Condition)5 factor(Condition)6 factor(Condition)7
7.656 22.628 20.582 24.883
factor(Condition)8 factor(Condition)9
32.250 41.650
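Because `scale=MSE` is supplied, `step()` ranks candidate models by Mallows' Cp, even though each step header still prints "AIC=". For a linear model with a fixed scale the criterion is RSS/scale - n + 2*edf, where edf is the number of estimated coefficients. The sketch below checks this against two rows of the output above. The RSS values are copied from that output, n = 595 follows from the 530 residual degrees of freedom plus 65 coefficients in the full model, and the rounded residual standard error of 37.57 stands in for the exact value, so small discrepancies are expected.

```r
# Values copied from the output above; sigma is the rounded residual
# standard error of the full model, so results match only approximately.
n     <- 595
sigma <- 37.57
rss0  <- 2893449        # RSS of Price ~ 1 (1 coefficient)
rss1  <- 1286545        # RSS after adding factor(ExteriorQ) (4 coefficients)

# Mallows' Cp as computed when a fixed scale is supplied.
cp <- function(rss, edf) rss / sigma^2 - n + 2 * edf

cp0 <- cp(rss0, 1)      # compare with the reported 1456.74
cp1 <- cp(rss1, 4)      # compare with the reported 324.40
c(cp0 = cp0, cp1 = cp1)
```

A lower Cp is better, which is why `factor(ExteriorQ)` is the first term added.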
modCatTransformForward=lm(Price~factor(HouseStyle)+factor(ExteriorQ)+factor(BasementFin)+factor(HeatingQC)+factor(KitchenQ)+factor(ExteriorC)+factor(CentralAir)+factor(GarageQ)+factor(Foundation)+factor(GarageC)+factor(BasementHt)+factor(GarageType)+factor(LotConfig)+factor(BasementC)+factor(Heating)+factor(Condition), data=AmesTrain6a)
MSE=(summary(modCatTransformForward)$sigma)^2
none=lm(Price~1,data=AmesTrain6a)
step(none,scope=list(upper=modCatTransformForward),scale=MSE, direction = "forward")
Start: AIC=1456.74
Price ~ 1
Df Sum of Sq RSS Cp
+ factor(ExteriorQ) 3 1606904 1286545 324.40
+ factor(BasementHt) 4 1555913 1337536 362.52
+ factor(KitchenQ) 3 1305178 1588271 538.14
+ factor(Foundation) 5 973531 1919917 777.09
+ factor(GarageType) 6 737589 2155859 946.23
+ factor(HeatingQC) 3 635156 2258293 1012.79
+ factor(Condition) 7 546078 2347371 1083.90
+ factor(BasementFin) 6 515220 2378228 1103.76
+ factor(HouseStyle) 6 230783 2662666 1305.26
+ factor(GarageC) 5 220424 2673025 1310.59
+ factor(GarageQ) 4 213737 2679712 1313.33
+ factor(CentralAir) 1 171099 2722350 1337.54
+ factor(BasementC) 4 116360 2777088 1382.31
+ factor(ExteriorC) 3 73758 2819691 1410.49
+ factor(LotConfig) 4 40148 2853301 1436.30
+ factor(Heating) 4 31951 2861498 1442.11
<none> 2893449 1456.74
Step: AIC=324.4
Price ~ factor(ExteriorQ)
Df Sum of Sq RSS Cp
+ factor(BasementHt) 4 250784 1035761 154.74
+ factor(GarageType) 6 213599 1072946 185.08
+ factor(Foundation) 5 127908 1158637 243.79
+ factor(KitchenQ) 3 111799 1174746 251.20
+ factor(GarageC) 5 61878 1224668 290.57
+ factor(CentralAir) 1 49963 1236582 291.01
+ factor(GarageQ) 4 51506 1235039 295.91
+ factor(Condition) 7 56831 1229715 298.14
+ factor(HouseStyle) 6 51999 1234546 299.56
+ factor(BasementFin) 6 50313 1236232 300.76
+ factor(HeatingQC) 3 34665 1251880 305.84
+ factor(LotConfig) 4 37324 1249222 305.96
+ factor(BasementC) 4 31824 1254721 309.86
+ factor(Heating) 4 14719 1271826 321.97
+ factor(ExteriorC) 3 9120 1277425 323.94
<none> 1286545 324.40
Step: AIC=154.74
Price ~ factor(ExteriorQ) + factor(BasementHt)
Df Sum of Sq RSS Cp
+ factor(GarageType) 6 124591 911170 78.481
+ factor(KitchenQ) 3 58853 976908 119.050
+ factor(GarageQ) 4 38749 997012 135.292
+ factor(HouseStyle) 6 43956 991805 135.604
+ factor(GarageC) 5 38839 996922 137.228
+ factor(CentralAir) 1 23287 1012474 140.246
+ factor(Foundation) 5 32902 1002859 141.434
+ factor(LotConfig) 4 24518 1011243 145.374
+ factor(HeatingQC) 3 19358 1016403 147.029
+ factor(Condition) 7 29957 1005804 147.521
<none> 1035761 154.742
+ factor(ExteriorC) 3 6387 1029374 156.217
+ factor(Heating) 4 9177 1026584 156.241
+ factor(BasementC) 3 5603 1030159 156.773
+ factor(BasementFin) 5 9762 1025999 157.827
Step: AIC=78.48
Price ~ factor(ExteriorQ) + factor(BasementHt) + factor(GarageType)
Df Sum of Sq RSS Cp
+ factor(KitchenQ) 3 46077 865093 51.839
+ factor(HouseStyle) 6 33520 877650 66.735
+ factor(Condition) 7 30314 880856 71.006
+ factor(LotConfig) 4 17289 893881 74.233
+ factor(GarageQ) 3 13232 897938 75.107
+ factor(BasementC) 3 11289 899881 76.484
+ factor(CentralAir) 1 4507 906662 77.288
+ factor(HeatingQC) 3 9175 901995 77.981
+ factor(ExteriorC) 3 8888 902282 78.184
+ factor(Foundation) 5 14356 896814 78.311
<none> 911170 78.481
+ factor(GarageC) 4 7043 904127 81.491
+ factor(Heating) 4 5425 905744 82.637
+ factor(BasementFin) 5 6296 904874 84.021
Step: AIC=51.84
Price ~ factor(ExteriorQ) + factor(BasementHt) + factor(GarageType) +
factor(KitchenQ)
Df Sum of Sq RSS Cp
+ factor(HouseStyle) 6 29516.5 835576 42.930
+ factor(LotConfig) 4 16523.4 848569 48.134
+ factor(Condition) 7 20942.2 844151 51.004
+ factor(BasementC) 3 9421.5 855671 51.165
<none> 865093 51.839
+ factor(CentralAir) 1 2384.5 862708 52.150
+ factor(GarageQ) 3 7666.8 857426 52.408
+ factor(Foundation) 5 11629.4 853463 53.601
+ factor(ExteriorC) 3 4888.3 860204 54.376
+ factor(GarageC) 4 7585.1 857508 54.466
+ factor(HeatingQC) 3 2954.0 862139 55.747
+ factor(Heating) 4 3609.1 861484 57.283
+ factor(BasementFin) 5 4771.0 860322 58.459
Step: AIC=42.93
Price ~ factor(ExteriorQ) + factor(BasementHt) + factor(GarageType) +
factor(KitchenQ) + factor(HouseStyle)
Df Sum of Sq RSS Cp
+ factor(LotConfig) 4 19230.6 816346 37.306
+ factor(BasementC) 3 10071.9 825504 41.795
+ factor(Condition) 7 21361.3 814215 41.797
+ factor(CentralAir) 1 3576.7 832000 42.396
<none> 835576 42.930
+ factor(GarageQ) 3 7820.7 827756 43.389
+ factor(GarageC) 4 7299.8 828276 45.758
+ factor(ExteriorC) 3 3909.3 831667 46.160
+ factor(BasementFin) 5 8833.3 826743 46.672
+ factor(Foundation) 5 8749.2 826827 46.732
+ factor(HeatingQC) 3 2709.1 832867 47.010
+ factor(Heating) 4 2843.8 832732 48.915
Step: AIC=37.31
Price ~ factor(ExteriorQ) + factor(BasementHt) + factor(GarageType) +
factor(KitchenQ) + factor(HouseStyle) + factor(LotConfig)
Df Sum of Sq RSS Cp
+ factor(Condition) 7 21084.9 795261 36.370
+ factor(BasementC) 3 9549.4 806796 36.542
+ factor(CentralAir) 1 3152.2 813193 37.073
<none> 816346 37.306
+ factor(GarageQ) 3 5789.7 810556 39.205
+ factor(BasementFin) 5 9684.9 806661 40.446
+ factor(ExteriorC) 3 3928.0 812418 40.524
+ factor(Foundation) 5 9138.4 807207 40.833
+ factor(GarageC) 4 6011.9 810334 41.048
+ factor(HeatingQC) 3 2689.6 813656 41.401
+ factor(Heating) 4 2312.3 814033 43.669
Step: AIC=36.37
Price ~ factor(ExteriorQ) + factor(BasementHt) + factor(GarageType) +
factor(KitchenQ) + factor(HouseStyle) + factor(LotConfig) +
factor(Condition)
Df Sum of Sq RSS Cp
<none> 795261 36.370
+ factor(BasementC) 3 7759.2 787502 36.873
+ factor(CentralAir) 1 824.6 794436 37.786
+ factor(GarageQ) 3 5350.1 789911 38.580
+ factor(Foundation) 5 10767.8 784493 38.742
+ factor(ExteriorC) 3 3198.1 792063 40.104
+ factor(BasementFin) 5 8249.1 787012 40.526
+ factor(HeatingQC) 3 1677.2 793584 41.182
+ factor(GarageC) 4 3843.9 791417 41.647
+ factor(Heating) 4 2857.7 792403 42.345
Call:
lm(formula = Price ~ factor(ExteriorQ) + factor(BasementHt) +
factor(GarageType) + factor(KitchenQ) + factor(HouseStyle) +
factor(LotConfig) + factor(Condition), data = AmesTrain6a)
Coefficients:
(Intercept) 371.587
factor(ExteriorQ)Fa -127.471
factor(ExteriorQ)Gd -64.446
factor(ExteriorQ)TA -97.138
factor(BasementHt)Fa -101.217
factor(BasementHt)Gd -61.932
factor(BasementHt)None -93.392
factor(BasementHt)TA -80.750
factor(GarageType)Attchd -15.095
factor(GarageType)Basment -33.555
factor(GarageType)BuiltIn -7.098
factor(GarageType)CarPort -67.326
factor(GarageType)Detchd -46.666
factor(GarageType)None -47.290
factor(KitchenQ)Fa -55.587
factor(KitchenQ)Gd -34.705
factor(KitchenQ)TA -43.160
factor(HouseStyle)1.5Unf -12.327
factor(HouseStyle)1Story -2.912
factor(HouseStyle)2.5Unf 32.865
factor(HouseStyle)2Story 9.627
factor(HouseStyle)SFoyer -28.005
factor(HouseStyle)SLvl -8.037
factor(LotConfig)CulDSac 9.395
factor(LotConfig)FR2 -20.856
factor(LotConfig)FR3 12.616
factor(LotConfig)Inside -7.700
factor(Condition)3 -6.741
factor(Condition)4 7.656
factor(Condition)5 22.628
factor(Condition)6 20.582
factor(Condition)7 24.883
factor(Condition)8 32.250
factor(Condition)9 41.650
# Backward selection starts from the full categorical model
modTransformCatBackward=lm(Price~factor(HouseStyle)+factor(ExteriorQ)+factor(BasementFin)+
    factor(HeatingQC)+factor(KitchenQ)+factor(ExteriorC)+factor(CentralAir)+factor(GarageQ)+
    factor(Foundation)+factor(GarageC)+factor(BasementHt)+factor(GarageType)+
    factor(LotConfig)+factor(BasementC)+factor(Heating)+factor(Condition), data=AmesTrain6a)
MSE=(summary(modTransformCatBackward)$sigma)^2
# With no scope supplied, step() runs backward elimination
step(modTransformCatBackward,scale=MSE)
Start: AIC=65
Price ~ factor(HouseStyle) + factor(ExteriorQ) + factor(BasementFin) +
factor(HeatingQC) + factor(KitchenQ) + factor(ExteriorC) +
factor(CentralAir) + factor(GarageQ) + factor(Foundation) +
factor(GarageC) + factor(BasementHt) + factor(GarageType) +
factor(LotConfig) + factor(BasementC) + factor(Heating) +
factor(Condition)
Df Sum of Sq RSS Cp
- factor(Foundation) 5 5666 753821 59.014
- factor(BasementFin) 5 6638 754793 59.702
- factor(HeatingQC) 3 1468 749623 60.040
- factor(Heating) 4 4527 752683 60.207
- factor(GarageC) 4 7684 755840 62.444
- factor(ExteriorC) 3 5741 753897 63.067
- factor(Condition) 7 17183 765339 63.173
- factor(CentralAir) 1 528 748684 63.374
- factor(BasementC) 3 7457 755613 64.283
<none> 748156 65.000
- factor(GarageQ) 3 9357 757513 65.629
- factor(LotConfig) 4 14998 763154 67.625
- factor(HouseStyle) 6 30084 778239 74.312
- factor(KitchenQ) 3 24715 772870 76.508
- factor(GarageType) 5 66239 814395 101.925
- factor(ExteriorQ) 3 82469 830624 117.421
- factor(BasementHt) 3 84383 832539 118.778
Step: AIC=59.01
Price ~ factor(HouseStyle) + factor(ExteriorQ) + factor(BasementFin) +
factor(HeatingQC) + factor(KitchenQ) + factor(ExteriorC) +
factor(CentralAir) + factor(GarageQ) + factor(GarageC) +
factor(BasementHt) + factor(GarageType) + factor(LotConfig) +
factor(BasementC) + factor(Heating) + factor(Condition)
Df Sum of Sq RSS Cp
- factor(BasementFin) 5 6861 760682 53.874
- factor(HeatingQC) 3 1864 755686 54.334
- factor(Heating) 4 6243 760065 55.437
- factor(GarageC) 4 8523 762344 57.051
- factor(CentralAir) 1 784 754605 57.569
- factor(Condition) 7 18067 771889 57.813
- factor(ExteriorC) 3 6914 760736 57.912
- factor(BasementC) 3 8469 762290 59.013
<none> 753821 59.014
- factor(GarageQ) 3 10841 764662 60.693
- factor(LotConfig) 4 14638 768459 61.383
- factor(HouseStyle) 6 33978 787799 71.084
- factor(KitchenQ) 3 25739 779560 71.247
- factor(GarageType) 5 71777 825598 99.861
- factor(ExteriorQ) 3 84305 838126 112.736
- factor(BasementHt) 3 96056 849877 121.060
Step: AIC=53.87
Price ~ factor(HouseStyle) + factor(ExteriorQ) + factor(HeatingQC) +
factor(KitchenQ) + factor(ExteriorC) + factor(CentralAir) +
factor(GarageQ) + factor(GarageC) + factor(BasementHt) +
factor(GarageType) + factor(LotConfig) + factor(BasementC) +
factor(Heating) + factor(Condition)
Df Sum of Sq RSS Cp
- factor(HeatingQC) 3 1799 762481 49.148
- factor(Heating) 4 6187 766869 50.257
- factor(GarageC) 4 8164 768845 51.657
- factor(CentralAir) 1 1036 761717 52.607
- factor(ExteriorC) 3 7228 767909 52.994
- factor(Condition) 7 19434 780115 53.641
<none> 760682 53.874
- factor(BasementC) 3 10045 770726 54.989
- factor(GarageQ) 3 10341 771023 55.199
- factor(LotConfig) 4 14118 774799 55.875
- factor(HouseStyle) 6 29978 790659 63.110
- factor(KitchenQ) 3 27319 788001 67.227
- factor(GarageType) 5 73113 833795 95.668
- factor(ExteriorQ) 3 84841 845523 107.976
- factor(BasementHt) 3 107894 868575 124.306
Step: AIC=49.15
Price ~ factor(HouseStyle) + factor(ExteriorQ) + factor(KitchenQ) +
factor(ExteriorC) + factor(CentralAir) + factor(GarageQ) +
factor(GarageC) + factor(BasementHt) + factor(GarageType) +
factor(LotConfig) + factor(BasementC) + factor(Heating) +
factor(Condition)
Df Sum of Sq RSS Cp
- factor(Heating) 4 6115 768596 45.480
- factor(GarageC) 4 7797 770278 46.672
- factor(ExteriorC) 3 7070 769551 48.157
- factor(CentralAir) 1 1477 763958 48.195
- factor(Condition) 7 19403 781884 48.894
<none> 762481 49.148
- factor(BasementC) 3 9962 772444 50.206
- factor(GarageQ) 3 10426 772907 50.534
- factor(LotConfig) 4 14161 776642 51.180
- factor(HouseStyle) 6 31376 793857 59.376
- factor(KitchenQ) 3 29453 791935 64.013
- factor(GarageType) 5 74425 836906 91.871
- factor(ExteriorQ) 3 88671 851152 105.963
- factor(BasementHt) 3 108927 871408 120.313
Step: AIC=45.48
Price ~ factor(HouseStyle) + factor(ExteriorQ) + factor(KitchenQ) +
factor(ExteriorC) + factor(CentralAir) + factor(GarageQ) +
factor(GarageC) + factor(BasementHt) + factor(GarageType) +
factor(LotConfig) + factor(BasementC) + factor(Condition)
Df Sum of Sq RSS Cp
- factor(GarageC) 4 7439 776035 42.750
- factor(ExteriorC) 3 4953 773549 42.989
- factor(CentralAir) 1 515 769111 43.845
- factor(Condition) 7 17492 786088 43.872
<none> 768596 45.480
- factor(GarageQ) 3 9250 777846 46.033
- factor(BasementC) 3 10096 778692 46.632
- factor(LotConfig) 4 15422 784018 48.405
- factor(HouseStyle) 6 31180 799776 55.569
- factor(KitchenQ) 3 31307 799903 61.658
- factor(GarageType) 5 74819 843415 88.483
- factor(ExteriorQ) 3 93092 861688 105.427
- factor(BasementHt) 3 106667 875263 115.044
Step: AIC=42.75
Price ~ factor(HouseStyle) + factor(ExteriorQ) + factor(KitchenQ) +
factor(ExteriorC) + factor(CentralAir) + factor(GarageQ) +
factor(BasementHt) + factor(GarageType) + factor(LotConfig) +
factor(BasementC) + factor(Condition)
Df Sum of Sq RSS Cp
- factor(ExteriorC) 3 4529 780565 39.959
- factor(GarageQ) 3 6360 782395 41.256
- factor(CentralAir) 1 770 776805 41.296
- factor(Condition) 7 18432 794467 41.808
<none> 776035 42.750
- factor(BasementC) 3 9778 785813 43.677
- factor(LotConfig) 4 16348 792384 46.332
- factor(HouseStyle) 6 32643 808678 53.875
- factor(KitchenQ) 3 27594 803629 56.298
- factor(GarageType) 5 80955 856991 90.100
- factor(ExteriorQ) 3 101505 877541 108.658
- factor(BasementHt) 3 112313 888348 116.314
Step: AIC=39.96
Price ~ factor(HouseStyle) + factor(ExteriorQ) + factor(KitchenQ) +
factor(CentralAir) + factor(GarageQ) + factor(BasementHt) +
factor(GarageType) + factor(LotConfig) + factor(BasementC) +
factor(Condition)
Df Sum of Sq RSS Cp
- factor(Condition) 7 16942 797507 37.961
- factor(GarageQ) 3 5865 786430 38.114
- factor(CentralAir) 1 785 781349 38.515
<none> 780565 39.959
- factor(BasementC) 3 8757 789322 40.163
- factor(LotConfig) 4 15985 796550 43.283
- factor(HouseStyle) 6 32949 813514 51.300
- factor(KitchenQ) 3 27491 808056 53.434
- factor(GarageType) 5 85120 865684 90.258
- factor(ExteriorQ) 3 100216 880781 104.953
- factor(BasementHt) 3 114195 894760 114.856
Step: AIC=37.96
Price ~ factor(HouseStyle) + factor(ExteriorQ) + factor(KitchenQ) +
factor(CentralAir) + factor(GarageQ) + factor(BasementHt) +
factor(GarageType) + factor(LotConfig) + factor(BasementC)
Df Sum of Sq RSS Cp
- factor(GarageQ) 3 6059 803566 36.254
- factor(CentralAir) 1 2686 800193 37.864
<none> 797507 37.961
- factor(BasementC) 3 10393 807900 39.324
- factor(LotConfig) 4 15974 813482 41.278
- factor(HouseStyle) 6 34061 831568 50.090
- factor(KitchenQ) 3 31930 829437 54.581
- factor(GarageType) 5 84122 881629 87.554
- factor(ExteriorQ) 3 111475 908982 110.931
- factor(BasementHt) 3 115425 912932 113.729
Step: AIC=36.25
Price ~ factor(HouseStyle) + factor(ExteriorQ) + factor(KitchenQ) +
factor(CentralAir) + factor(BasementHt) + factor(GarageType) +
factor(LotConfig) + factor(BasementC)
Df Sum of Sq RSS Cp
<none> 803566 36.254
- factor(CentralAir) 1 3230 806796 36.542
- factor(BasementC) 3 9627 813193 37.073
- factor(LotConfig) 4 18042 821609 41.035
- factor(HouseStyle) 6 33772 837338 48.178
- factor(KitchenQ) 3 36568 840134 56.158
- factor(GarageType) 6 91186 894753 88.851
- factor(ExteriorQ) 3 110197 913763 108.318
- factor(BasementHt) 3 114370 917937 111.275
Call:
lm(formula = Price ~ factor(HouseStyle) + factor(ExteriorQ) +
factor(KitchenQ) + factor(CentralAir) + factor(BasementHt) +
factor(GarageType) + factor(LotConfig) + factor(BasementC),
data = AmesTrain6a)
Coefficients:
(Intercept) 405.314
factor(HouseStyle)1.5Unf -8.381
factor(HouseStyle)1Story -1.405
factor(HouseStyle)2.5Unf 32.462
factor(HouseStyle)2Story 13.102
factor(HouseStyle)SFoyer -21.945
factor(HouseStyle)SLvl -5.033
factor(ExteriorQ)Fa -130.500
factor(ExteriorQ)Gd -66.651
factor(ExteriorQ)TA -100.787
factor(KitchenQ)Fa -55.427
factor(KitchenQ)Gd -33.385
factor(KitchenQ)TA -44.764
factor(CentralAir)Y 10.535
factor(BasementHt)Fa -98.775
factor(BasementHt)Gd -60.523
factor(BasementHt)None -110.859
factor(BasementHt)TA -76.305
factor(GarageType)Attchd -19.688
factor(GarageType)Basment -38.752
factor(GarageType)BuiltIn -12.034
factor(GarageType)CarPort -69.540
factor(GarageType)Detchd -50.137
factor(GarageType)None -50.481
factor(LotConfig)CulDSac 9.261
factor(LotConfig)FR2 -20.069
factor(LotConfig)FR3 13.020
factor(LotConfig)Inside -7.452
factor(BasementC)Fa -12.318
factor(BasementC)Gd -0.857
factor(BasementC)None NA
factor(BasementC)TA -20.085
modCatReduced = lm (Price ~ factor(ExteriorQ) + factor(BasementHt) +
factor(GarageType) + factor(KitchenQ) + factor(HouseStyle) +
factor(LotConfig) + factor(Condition), data = AmesTrain6a)
modCatFull=lm(Price~factor(HouseStyle)+factor(ExteriorQ)+factor(BasementFin)+factor(HeatingQC)+factor(KitchenQ)+factor(ExteriorC)+factor(CentralAir)+factor(GarageQ)+factor(Foundation)+factor(GarageC)+factor(BasementHt)+factor(GarageType)+factor(LotConfig)+factor(BasementC)+factor(Heating)+factor(Condition), data=AmesTrain6a)
Cp455(modCatReduced, modCatFull)
[1] 36.36975
After running forward, backward, and stepwise selection, we will use the categorical model that forward and stepwise selection both produced. Because we passed scale=MSE to step(), the value R labels "AIC" in the output is Mallows' Cp: 36.37 for forward and stepwise selection and 36.25 for backward selection. Backward selection scored slightly lower, but we chose the forward/stepwise factors because the two methods agreed and because we may run a second round of selection once these variables are combined with the numerical ones. The next model will therefore include ExteriorQ, BasementHt, GarageType, KitchenQ, HouseStyle, LotConfig, and Condition. The Cp for this model, 36.37, is well above the seven factors it contains, which suggests the model is not very efficient. Note, though, that Cp is judged against the number of estimated coefficients, and these seven factors expand to 34 coefficients including the intercept, so the gap is smaller than the factor count suggests. Cp is also about the same across forward, backward, and stepwise selection, and it may improve once we include numerical variables.
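As a rough illustration of where a Cp value like the one above comes from, the sketch below computes Mallows' Cp by hand. The helper `mallows_cp` is hypothetical and written only for this example (the Cp455 function is not shown in this report, so this is not its definition), and the built-in mtcars data stands in for AmesTrain6a.

```r
# Hypothetical helper (not the Cp455 function used in this report):
# Mallows' Cp = SSE_reduced / MSE_full + 2 * p - n,
# where p counts the estimated coefficients in the reduced model (intercept included).
mallows_cp <- function(reduced, full) {
  n <- length(residuals(full))
  mse_full <- summary(full)$sigma^2
  sse_reduced <- sum(residuals(reduced)^2)
  p <- reduced$rank  # number of estimable coefficients
  sse_reduced / mse_full + 2 * p - n
}

# mtcars stands in for the Ames data here
full <- lm(mpg ~ wt + hp + disp + factor(cyl), data = mtcars)
reduced <- lm(mpg ~ wt + factor(cyl), data = mtcars)

cp_reduced <- mallows_cp(reduced, full)
cp_full <- mallows_cp(full, full)  # for the full model, Cp equals its coefficient count

# step(..., scale = MSE) reports this same quantity under the "AIC" label
cp_from_step <- extractAIC(reduced, scale = summary(full)$sigma^2)[2]

print(c(cp_reduced = cp_reduced, cp_full = cp_full, cp_from_step = cp_from_step))
```

Plugging in the numbers from this report (RSS of 795261 for the reduced model, a full-model residual standard error of 37.57, 34 coefficients, and 595 observations) gives 795261 / 37.57^2 + 2 * 34 - 595, or about 36.4, in line with the Cp455 output.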
We chose these predictors to transform based on the stepwise, backward, and forward selection that we did for Assignment #3.
modLotArea=lm(Price~LotArea, data=AmesTrain6a)
modLotAreaSquared=lm(Price~LotArea+I(LotArea^2), data=AmesTrain6a)
modLotAreaSqrt=lm(Price~LotArea+I(sqrt(LotArea)), data=AmesTrain6a)
modLotAreaLog=lm(Price~(log(LotArea)), data=AmesTrain6a)
modLotAreaAll=lm(Price~LotArea+I(LotArea^2)+I(sqrt(LotArea))+I(log(LotArea)), data=AmesTrain6a)
anova(modLotArea, modLotAreaSquared, modLotAreaSqrt, modLotAreaLog, modLotAreaAll)
Analysis of Variance Table
Model 1: Price ~ LotArea
Model 2: Price ~ LotArea + I(LotArea^2)
Model 3: Price ~ LotArea + I(sqrt(LotArea))
Model 4: Price ~ (log(LotArea))
Model 5: Price ~ LotArea + I(LotArea^2) + I(sqrt(LotArea)) + I(log(LotArea))
Res.Df RSS Df Sum of Sq F Pr(>F)
1 593 2602283
2 592 2523892 1 78391 18.9784 1.559e-05 ***
3 592 2567183 0 -43291
4 593 2582422 -1 -15238 3.6892 0.05525 .
5 590 2437001 3 145420 11.7355 1.776e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
plot(modLotArea$residuals~modLotArea$fitted.values)
abline(0,0)
plot(modLotAreaSquared$residuals~modLotAreaSquared$fitted.values)
abline(0,0)
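This anova table shows that we should use LotArea^2 as our transformation. Adding the squared term to Model 1 is highly significant (F = 18.98, p = 1.6e-05) and costs only one extra coefficient. The full model (Model 5) is also significant, but it needs three more terms, so the quadratic gives the best balance between the number of variables used and a significant p-value. Models 3 and 4 are not nested in the models listed just above them, so their rows should not be read as F-tests.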
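The comparison above can be sketched in a smaller form. This is a minimal example, assuming the built-in mtcars data as a stand-in for AmesTrain6a (with hp playing the role of LotArea). It uses an F-test only for the nested linear/quadratic pair and ranks the non-nested log model by AIC, which is one common alternative and not the method used in this report.

```r
# mtcars stands in for the Ames data; hp plays the role of a numeric predictor
m_linear <- lm(mpg ~ hp, data = mtcars)
m_quad   <- lm(mpg ~ hp + I(hp^2), data = mtcars)
m_log    <- lm(mpg ~ log(hp), data = mtcars)

# Nested pair: the F-test asks whether the squared term is worth adding
nested_test <- anova(m_linear, m_quad)
print(nested_test)

# Non-nested candidates with the same response can be ranked by AIC instead
aic_table <- AIC(m_linear, m_quad, m_log)
print(aic_table)
```

Lower AIC is better, and the df column counts each model's coefficients plus the error variance.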
modYearBuilt=lm(Price~YearBuilt, data=AmesTrain6a)
modYearBuiltSquared=lm(Price~YearBuilt+I(YearBuilt^2), data=AmesTrain6a)
modYearBuiltSqrt=lm(Price~YearBuilt+I(sqrt(YearBuilt)), data=AmesTrain6a)
modYearBuiltLog=lm(Price~(log(YearBuilt)), data=AmesTrain6a)
modYearBuiltFull=lm(Price)~YearBuilt+I(YearBuilt^2)+I(sqrt(YearBuilt))+I(log(YearBuilt), data=AmesTrain6a)
anova(modYearBuilt, modYearBuiltSquared, modYearBuiltSqrt, modYearBuiltLog, modYearBuiltFull)
models with response ‘"NULL"’ removed because response differs from model 1
Analysis of Variance Table
Model 1: Price ~ YearBuilt
Model 2: Price ~ YearBuilt + I(YearBuilt^2)
Model 3: Price ~ YearBuilt + I(sqrt(YearBuilt))
Model 4: Price ~ (log(YearBuilt))
Res.Df RSS Df Sum of Sq F Pr(>F)
1 593 1896928
2 592 1624934 1 271994 99.094 < 2.2e-16 ***
3 592 1626098 0 -1165
4 593 1906324 -1 -280226 102.093 < 2.2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
plot(modYearBuilt$residuals~modYearBuilt$fitted.values)
abline(0,0)
plot(modYearBuiltSquared$residuals~modYearBuiltSquared$fitted.values)
abline(0,0)
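This shows that we should use YearBuilt^2 as our transformation. Adding the squared term is highly significant (F = 99.09, p < 2.2e-16) for only one extra coefficient, which is the best balance between the number of variables used and a significant p-value. The full model was dropped from the table (see the message printed above it), so only Models 1 through 4 are compared.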
modWoodDeckSF=lm(Price~WoodDeckSF, data=AmesTrain6a)
modWoodDeckSFSquared=lm(Price~WoodDeckSF+I(WoodDeckSF^2), data=AmesTrain6a)
modWoodDeckSFSqrt=lm(Price~WoodDeckSF+I(sqrt(WoodDeckSF)), data=AmesTrain6a)
modWoodDeckSFLog=lm(Price~(log(WoodDeckSF+1)), data=AmesTrain6a)
modWoodDeckSFFull=lm(Price)~WoodDeckSF+I(WoodDeckSF^2)+I(sqrt(WoodDeckSF))+I(log(WoodDeckSF+1), data=AmesTrain6a)
anova(modWoodDeckSF, modWoodDeckSFSquared, modWoodDeckSFSqrt, modWoodDeckSFLog, modWoodDeckSFFull)
models with response ‘"NULL"’ removed because response differs from model 1
Analysis of Variance Table
Model 1: Price ~ WoodDeckSF
Model 2: Price ~ WoodDeckSF + I(WoodDeckSF^2)
Model 3: Price ~ WoodDeckSF + I(sqrt(WoodDeckSF))
Model 4: Price ~ (log(WoodDeckSF + 1))
Res.Df RSS Df Sum of Sq F Pr(>F)
1 593 2542776
2 592 2485078 1 57698 13.7450 0.000229 ***
3 592 2498904 0 -13825
4 593 2521822 -1 -22918 5.4596 0.019794 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
plot(modWoodDeckSF$residuals~modWoodDeckSF$fitted.values)
abline(0,0)
plot(modWoodDeckSFSquared$residuals~modWoodDeckSFSquared$fitted.values)
abline(0,0)
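This shows that we should use WoodDeckSF^2 as our transformation. Adding the squared term is significant (F = 13.75, p = 0.0002) for only one extra coefficient, which is the best balance between the number of variables used and a significant p-value. The full model was dropped from the table (see the message printed above it), so only Models 1 through 4 are compared.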
modGroundSF=lm(Price~GroundSF, data=AmesTrain6a)
modGroundSFSquared=lm(Price~GroundSF+I(GroundSF^2), data=AmesTrain6a)
modGroundSFSqrt=lm(Price~GroundSF+I(sqrt(GroundSF)), data=AmesTrain6a)
modGroundSFLog=lm(Price~(log(GroundSF+1)), data=AmesTrain6a)
modGroundSFFull=lm(Price)~GroundSFF+I(GroundSF^2)+I(sqrt(GroundSF))+I(log(GroundSF+1), data=AmesTrain6a)
anova(modGroundSF, modGroundSFSquared, modGroundSFSqrt, modGroundSFLog, modGroundSFFull)
models with response ‘"NULL"’ removed because response differs from model 1
Analysis of Variance Table
Model 1: Price ~ GroundSF
Model 2: Price ~ GroundSF + I(GroundSF^2)
Model 3: Price ~ GroundSF + I(sqrt(GroundSF))
Model 4: Price ~ (log(GroundSF + 1))
Res.Df RSS Df Sum of Sq F Pr(>F)
1 593 1465476
2 592 1462787 1 2689 1.0884 0.2973
3 592 1463302 0 -515
4 593 1541522 -1 -78221 31.6564 2.841e-08 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
plot(modGroundSF$residuals~modGroundSF$fitted.values)
abline(0,0)
plot(modGroundSFSquared$residuals~modGroundSFSquared$fitted.values)
abline(0,0)
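We're keeping GroundSF untransformed. The squared term is not significant (F = 1.09, p = 0.30), so the plain linear term gives the best balance between the number of variables used and a significant p-value. The full model was dropped from the table (see the message printed above it), so only Models 1 through 4 are compared.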
modFullBath = lm(Price~FullBath, data = AmesTrain6a)
modFullBathSquared= lm(Price ~ FullBath+I(FullBath^2), data = AmesTrain6a)
modFullBathSqrt= lm(Price~FullBath+I(sqrt(FullBath)), data=AmesTrain6a)
modFullBathLog= lm(Price~(log(FullBath+1)), data=AmesTrain6a)
modFullBathFull = lm(Price~FullBath+I(FullBath^2)+I(sqrt(FullBath))+I(log(FullBath+1)), data=AmesTrain6a)
anova(modFullBath, modFullBathSquared, modFullBathSqrt, modFullBathLog, modFullBathFull)
Analysis of Variance Table
Model 1: Price ~ FullBath
Model 2: Price ~ FullBath + I(FullBath^2)
Model 3: Price ~ FullBath + I(sqrt(FullBath))
Model 4: Price ~ (log(FullBath + 1))
Model 5: Price ~ FullBath + I(FullBath^2) + I(sqrt(FullBath)) + I(log(FullBath +
1))
Res.Df RSS Df Sum of Sq F Pr(>F)
1 593 1982921
2 592 1982652 1 269 0.0808 0.776323
3 592 1973813 0 8839
4 593 2007246 -1 -33433 10.0419 0.001609 **
5 591 1967617 2 39628 5.9514 0.002761 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
plot(modFullBath$residuals~modFullBath$fitted.values)
abline(0,0)
plot(modFullBathSquared$residuals~modFullBathSquared$fitted.values)
abline(0,0)
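The table does not support adding FullBath^2: the squared term is not significant (F = 0.08, p = 0.78). The full model (Model 5) is significant (p = 0.003), but it needs several extra terms for a single predictor, so FullBath stays untransformed, which is how it enters the combined numeric model below.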
modTotalRooms = lm(Price ~ TotalRooms, data = AmesTrain6a)
modTotalRoomsSquared = lm(Price ~ TotalRooms+I(TotalRooms^2), data = AmesTrain6a)
modTotalRoomsSqrt = lm(Price~TotalRooms+I(sqrt(TotalRooms)), data=AmesTrain6a)
modTotalRoomsLog = lm(Price~(log(TotalRooms+1)), data=AmesTrain6a)
modTotalRoomsFull = lm(Price~TotalRooms+I(TotalRooms^2)+I(sqrt(TotalRooms))+I(log(TotalRooms+1)), data=AmesTrain6a)
anova(modTotalRooms, modTotalRoomsSquared, modTotalRoomsSqrt, modTotalRoomsLog, modTotalRoomsFull)
Analysis of Variance Table
Model 1: Price ~ TotalRooms
Model 2: Price ~ TotalRooms + I(TotalRooms^2)
Model 3: Price ~ TotalRooms + I(sqrt(TotalRooms))
Model 4: Price ~ (log(TotalRooms + 1))
Model 5: Price ~ TotalRooms + I(TotalRooms^2) + I(sqrt(TotalRooms)) +
I(log(TotalRooms + 1))
Res.Df RSS Df Sum of Sq F Pr(>F)
1 593 2280320
2 592 2280004 1 315.2 0.0823 0.77434
3 592 2280299 0 -294.6
4 593 2291422 -1 -11123.3 2.9036 0.08891 .
5 590 2260256 3 31166.9 2.7119 0.04425 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
plot(modTotalRooms$residuals~modTotalRooms$fitted.values)
abline(0,0)
plot(modTotalRoomsSquared$residuals~modTotalRoomsSquared$fitted.values)
abline(0,0)
This shows that we should use TotalRooms + I(TotalRooms^2) + I(sqrt(TotalRooms)) + I(log(TotalRooms + 1)) as our transformation because it is the only one with a significant p-value (p = 0.044). We may not include this version in a final model because of how many terms it adds for a single predictor.
modBasementSF=lm(Price~BasementSF, data=AmesTrain6a)
modBasementSFSquared=lm(Price~BasementSF+I(BasementSF^2), data=AmesTrain6a)
modBasementSFSqrt=lm(Price~BasementSF+I(sqrt(BasementSF)), data=AmesTrain6a)
modBasementSFLog=lm(Price~(log(BasementSF+1)), data=AmesTrain6a)
modBasementSFFull=lm(Price~BasementSF+I(BasementSF^2)+I(sqrt(BasementSF))+I(log(BasementSF+1)), data=AmesTrain6a)
anova(modBasementSF, modBasementSFSquared, modBasementSFSqrt, modBasementSFLog, modBasementSFFull)
Analysis of Variance Table
Model 1: Price ~ BasementSF
Model 2: Price ~ BasementSF + I(BasementSF^2)
Model 3: Price ~ BasementSF + I(sqrt(BasementSF))
Model 4: Price ~ (log(BasementSF + 1))
Model 5: Price ~ BasementSF + I(BasementSF^2) + I(sqrt(BasementSF)) +
I(log(BasementSF + 1))
Res.Df RSS Df Sum of Sq F Pr(>F)
1 593 1882672
2 592 1783033 1 99639 33.025 1.459e-08 ***
3 592 1800490 0 -17457
4 593 2661798 -1 -861308 285.477 < 2.2e-16 ***
5 590 1780081 3 881717 97.414 < 2.2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
plot(modBasementSF$residuals~modBasementSF$fitted.values)
abline(0,0)
plot(modBasementSFSquared$residuals~modBasementSFSquared$fitted.values)
abline(0,0)
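This shows that we should use BasementSF^2 as our transformation. The squared term is highly significant (F = 33.03, p = 1.5e-08) for only one extra coefficient, which is the best balance between the number of variables used and a significant p-value. The row for log(BasementSF + 1) is also flagged as significant, but that reflects a large increase in RSS (from 1800490 to 2661798), so the log model fits worse, not better.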
modGarageCars=lm(Price~GarageCars, data=AmesTrain6a)
modGarageCarsSquared=lm(Price~GarageCars+I(GarageCars^2), data=AmesTrain6a)
modGarageCarsSqrt=lm(Price~GarageCars+I(sqrt(GarageCars)), data=AmesTrain6a)
modGarageCarsLog=lm(Price~(log(GarageCars+1)), data=AmesTrain6a)
modGarageCarsFull=lm(Price~GarageCars+I(GarageCars^2)+I(sqrt(GarageCars))+I(log(GarageCars+1)), data=AmesTrain6a)
anova(modGarageCars, modGarageCarsSquared, modGarageCarsSqrt, modGarageCarsLog, modGarageCarsFull)
Analysis of Variance Table
Model 1: Price ~ GarageCars
Model 2: Price ~ GarageCars + I(GarageCars^2)
Model 3: Price ~ GarageCars + I(sqrt(GarageCars))
Model 4: Price ~ (log(GarageCars + 1))
Model 5: Price ~ GarageCars + I(GarageCars^2) + I(sqrt(GarageCars)) +
I(log(GarageCars + 1))
Res.Df RSS Df Sum of Sq F Pr(>F)
1 593 1774895
2 592 1612893 1 162002 60.288 3.639e-14 ***
3 592 1650724 0 -37831
4 593 2004694 -1 -353970 131.728 < 2.2e-16 ***
5 590 1585410 3 419284 52.011 < 2.2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
plot(modGarageCars$residuals~modGarageCars$fitted.values)
abline(0,0)
plot(modGarageCarsSquared$residuals~modGarageCarsSquared$fitted.values)
abline(0,0)
This shows that we should use GarageCars^2 as our transformation. Adding the squared term is highly significant (F = 60.29, p = 3.6e-14) for only one extra coefficient, which is the best balance between the number of variables used and a significant p-value.
modTransformNumericLog=lm(log(Price)~LotArea+I(LotArea^2)+YearBuilt+I(YearBuilt^2)+BasementSF+I(BasementSF^2)+GarageCars+I(GarageCars^2)+WoodDeckSF+I(WoodDeckSF^2)+GroundSF+I(GroundSF^2)+FullBath+TotalRooms, data=AmesTrain6a)
summary(modTransformNumericLog)
modTransformNumeric=lm(Price~LotArea+I(LotArea^2)+YearBuilt+I(YearBuilt^2)+BasementSF+I(BasementSF^2)+GarageCars+I(GarageCars^2)+WoodDeckSF+I(WoodDeckSF^2)+GroundSF+I(GroundSF^2)+FullBath+TotalRooms, data=AmesTrain6a)
summary(modTransformNumeric)
modTransformCatLog=lm(log(Price)~factor(HouseStyle)+factor(ExteriorQ)+factor(BasementFin)+factor(HeatingQC)+factor(KitchenQ)+factor(ExteriorC)+factor(CentralAir)+factor(GarageQ)+factor(Foundation)+factor(GarageC)+factor(BasementHt)+factor(GarageType)+factor(LotConfig)+factor(BasementC)+factor(Heating)+factor(Condition), data=AmesTrain6a)
summary(modTransformCatLog)
modTransformCat=lm(Price~factor(HouseStyle)+factor(ExteriorQ)+factor(BasementFin)+factor(HeatingQC)+factor(KitchenQ)+factor(ExteriorC)+factor(CentralAir)+factor(GarageQ)+factor(Foundation)+factor(GarageC)+factor(BasementHt)+factor(GarageType)+factor(LotConfig)+factor(BasementC)+factor(Heating)+factor(Condition), data=AmesTrain6a)
summary(modTransformCat)
We decided not to log the response because the logged models did not fit meaningfully better. With the numeric predictors, adjusted R-squared was slightly higher for log(Price) than for Price (by .001), but with the categorical predictors Price was higher than log(Price) by .02. We also didn't want to overfit the data with too many transformations, so we chose to keep Price as the response variable.
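One caveat when weighing a logged response against an unlogged one is that the two R-squared values are measured on different scales. The sketch below shows one possible way to put both models on the original scale by back-transforming the fitted values and comparing RMSE. It is an illustration only, not the comparison made in this report, and it uses the built-in mtcars data in place of AmesTrain6a; the `rmse` helper is defined here for the example.

```r
# mtcars stands in for the Ames data; mpg plays the role of Price
m_raw <- lm(mpg ~ wt + hp, data = mtcars)
m_log <- lm(log(mpg) ~ wt + hp, data = mtcars)

# Root mean squared error in the units of the original response
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

rmse_raw <- rmse(mtcars$mpg, fitted(m_raw))
# Simple back-transform with exp(); no bias correction is applied
rmse_log <- rmse(mtcars$mpg, exp(fitted(m_log)))

print(c(raw = rmse_raw, log_backtransformed = rmse_log))
```

Whichever model has the lower RMSE predicts the untransformed response more closely on the data it was fit to.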
# I() makes R add the bath counts arithmetically, giving one weighted bathroom total
modAllBathroom=lm(Price~I(FullBath+BasementFBath+0.5*BasementHBath+0.5*HalfBath), data=AmesTrain6a)
summary(modAllBathroom)
We chose not to combine any of the variables because the combinations didn't meaningfully improve the model. For example, none of our experimental combinations of the bath variables both raised the adjusted R-squared value and lowered AIC and Mallows' Cp.
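A check of this kind can be sketched as follows. The data frame `baths` is simulated purely for illustration (its column names mirror the bath variables in this report, but none of the values come from the Ames data), and the half-bath weight of 0.5 is the one used in the combined model above.

```r
set.seed(1)
n <- 200

# Simulated stand-in data; not the Ames data
baths <- data.frame(
  FullBath      = sample(1:3, n, replace = TRUE),
  HalfBath      = sample(0:1, n, replace = TRUE),
  BasementFBath = sample(0:1, n, replace = TRUE),
  BasementHBath = sample(0:1, n, replace = TRUE)
)
baths$Price <- 100 + 30 * baths$FullBath + 15 * baths$HalfBath +
  20 * baths$BasementFBath + 10 * baths$BasementHBath + rnorm(n, sd = 25)

# Separate predictors versus a single weighted bathroom total
mod_separate <- lm(Price ~ FullBath + HalfBath + BasementFBath + BasementHBath,
                   data = baths)
# I() makes R treat + and * as arithmetic rather than formula operators
mod_combined <- lm(Price ~ I(FullBath + BasementFBath + 0.5 * BasementHBath + 0.5 * HalfBath),
                   data = baths)

comparison <- data.frame(
  model  = c("separate", "combined"),
  adj_r2 = c(summary(mod_separate)$adj.r.squared, summary(mod_combined)$adj.r.squared),
  aic    = c(AIC(mod_separate), AIC(mod_combined))
)
print(comparison)
```

The combined model estimates a single slope for the bathroom total, so it is only worth keeping if it matches the separate model's fit with fewer coefficients.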
modTransformCat=lm(Price~factor(HouseStyle)+factor(ExteriorQ)+factor(BasementFin)+factor(HeatingQC)+factor(KitchenQ)+factor(ExteriorC)+factor(CentralAir)+factor(GarageQ)+factor(Foundation)+factor(GarageC)+factor(BasementHt)+factor(GarageType)+factor(LotConfig)+factor(BasementC)+factor(Heating)+factor(Condition), data=AmesTrain6a)
summary(modTransformCat)
Call:
lm(formula = Price ~ factor(HouseStyle) + factor(ExteriorQ) +
factor(BasementFin) + factor(HeatingQC) + factor(KitchenQ) +
factor(ExteriorC) + factor(CentralAir) + factor(GarageQ) +
factor(Foundation) + factor(GarageC) + factor(BasementHt) +
factor(GarageType) + factor(LotConfig) + factor(BasementC) +
factor(Heating) + factor(Condition), data = AmesTrain6a)
Residuals:
Min 1Q Median 3Q Max
-96.316 -23.460 -0.895 18.138 154.613
Coefficients: (4 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 276.8311 87.8232 3.152 0.001712 **
factor(HouseStyle)1.5Unf -4.5273 20.4211 -0.222 0.824633
factor(HouseStyle)1Story -0.0779 6.5346 -0.012 0.990493
factor(HouseStyle)2.5Unf 42.7315 23.8827 1.789 0.074150 .
factor(HouseStyle)2Story 11.6310 7.0351 1.653 0.098862 .
factor(HouseStyle)SFoyer -24.9542 12.1385 -2.056 0.040291 *
factor(HouseStyle)SLvl -7.8712 9.2634 -0.850 0.395873
factor(ExteriorQ)Fa -127.2075 23.2606 -5.469 6.99e-08 ***
factor(ExteriorQ)Gd -62.8435 13.3102 -4.721 3.00e-06 ***
factor(ExteriorQ)TA -94.3985 14.1932 -6.651 7.27e-11 ***
factor(BasementFin)BLQ 0.8396 7.0710 0.119 0.905529
factor(BasementFin)GLQ 3.5429 5.5164 0.642 0.520991
factor(BasementFin)LwQ 1.7849 8.8851 0.201 0.840869
factor(BasementFin)None -110.4109 42.1347 -2.620 0.009034 **
factor(BasementFin)Rec 1.2895 7.3736 0.175 0.861240
factor(BasementFin)Unf -6.0051 5.3489 -1.123 0.262085
factor(HeatingQC)Fa -0.6798 11.2329 -0.061 0.951765
factor(HeatingQC)Gd 4.0802 5.3941 0.756 0.449737
factor(HeatingQC)TA -1.1773 4.9969 -0.236 0.813823
factor(KitchenQ)Fa -52.7203 15.1922 -3.470 0.000562 ***
factor(KitchenQ)Gd -34.8167 9.5749 -3.636 0.000304 ***
factor(KitchenQ)TA -40.5129 10.1530 -3.990 7.53e-05 ***
factor(ExteriorC)Fa 4.3272 31.9069 0.136 0.892173
factor(ExteriorC)Gd -25.8265 26.4996 -0.975 0.330203
factor(ExteriorC)TA -19.6775 27.3093 -0.721 0.471509
factor(CentralAir)Y 5.2874 8.6428 0.612 0.540953
factor(GarageQ)Gd 61.5677 27.1955 2.264 0.023984 *
factor(GarageQ)None 46.7622 50.8032 0.920 0.357753
factor(GarageQ)Po 49.5994 70.6183 0.702 0.482764
factor(GarageQ)TA 0.1246 10.8491 0.011 0.990844
factor(Foundation)CBlock 1.7521 7.6006 0.231 0.817772
factor(Foundation)PConc 11.4451 8.6758 1.319 0.187676
factor(Foundation)Slab 12.2294 22.4476 0.545 0.586120
factor(Foundation)Stone 18.3872 24.4244 0.753 0.451891
factor(Foundation)Wood 24.3737 29.0463 0.839 0.401774
factor(GarageC)Fa 85.3064 48.2085 1.770 0.077380 .
factor(GarageC)Gd 98.6830 50.8363 1.941 0.052765 .
factor(GarageC)None NA NA NA NA
factor(GarageC)Po 88.2301 55.9473 1.577 0.115387
factor(GarageC)TA 98.0079 46.5230 2.107 0.035617 *
factor(BasementHt)Fa -89.7863 14.3966 -6.237 9.14e-10 ***
factor(BasementHt)Gd -58.4549 8.2455 -7.089 4.33e-12 ***
factor(BasementHt)None NA NA NA NA
factor(BasementHt)TA -72.2207 10.0079 -7.216 1.86e-12 ***
factor(GarageType)Attchd -17.1134 18.0483 -0.948 0.343460
factor(GarageType)Basment -32.2733 23.4157 -1.378 0.168700
factor(GarageType)BuiltIn -5.6891 19.0984 -0.298 0.765907
factor(GarageType)CarPort -58.3085 45.1565 -1.291 0.197178
factor(GarageType)Detchd -46.0371 18.1132 -2.542 0.011317 *
factor(GarageType)None NA NA NA NA
factor(LotConfig)CulDSac 7.4004 7.6958 0.962 0.336682
factor(LotConfig)FR2 -20.1713 8.8796 -2.272 0.023509 *
factor(LotConfig)FR3 10.9377 20.5572 0.532 0.594905
factor(LotConfig)Inside -7.3204 4.6136 -1.587 0.113172
factor(BasementC)Fa -7.9727 39.4513 -0.202 0.839924
factor(BasementC)Gd -1.8171 39.2841 -0.046 0.963124
factor(BasementC)None NA NA NA NA
factor(BasementC)TA -18.5301 38.3507 -0.483 0.629172
factor(Heating)GasW 24.5236 17.0449 1.439 0.150806
factor(Heating)Grav -43.9947 45.3212 -0.971 0.332125
factor(Heating)OthW -1.8212 41.8501 -0.044 0.965306
factor(Heating)Wall 8.0581 32.3359 0.249 0.803302
factor(Condition)3 0.4042 51.1716 0.008 0.993701
factor(Condition)4 22.0054 50.2147 0.438 0.661400
factor(Condition)5 35.7435 50.0151 0.715 0.475137
factor(Condition)6 33.5797 50.1292 0.670 0.503237
factor(Condition)7 36.8567 50.2968 0.733 0.464014
factor(Condition)8 44.5512 50.5200 0.882 0.378256
factor(Condition)9 49.7814 52.3594 0.951 0.342158
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 37.57 on 530 degrees of freedom
Multiple R-squared: 0.7414, Adjusted R-squared: 0.7102
F-statistic: 23.75 on 64 and 530 DF, p-value: < 2.2e-16
MSE=(summary(modTransformCat)$sigma)^2
step(none,scope=list(upper=modTransformCat),scale=MSE)
Start: AIC=1456.74
Price ~ 1
Df Sum of Sq RSS Cp
+ factor(ExteriorQ) 3 1606904 1286545 324.40
+ factor(BasementHt) 4 1555913 1337536 362.52
+ factor(KitchenQ) 3 1305178 1588271 538.14
+ factor(Foundation) 5 973531 1919917 777.09
+ factor(GarageType) 6 737589 2155859 946.23
+ factor(HeatingQC) 3 635156 2258293 1012.79
+ factor(Condition) 7 546078 2347371 1083.90
+ factor(BasementFin) 6 515220 2378228 1103.76
+ factor(HouseStyle) 6 230783 2662666 1305.26
+ factor(GarageC) 5 220424 2673025 1310.59
+ factor(GarageQ) 4 213737 2679712 1313.33
+ factor(CentralAir) 1 171099 2722350 1337.54
+ factor(BasementC) 4 116360 2777088 1382.31
+ factor(ExteriorC) 3 73758 2819691 1410.49
+ factor(LotConfig) 4 40148 2853301 1436.30
+ factor(Heating) 4 31951 2861498 1442.11
<none> 2893449 1456.74
Step: AIC=324.4
Price ~ factor(ExteriorQ)
Df Sum of Sq RSS Cp
+ factor(BasementHt) 4 250784 1035761 154.74
+ factor(GarageType) 6 213599 1072946 185.08
+ factor(Foundation) 5 127908 1158637 243.79
+ factor(KitchenQ) 3 111799 1174746 251.20
+ factor(GarageC) 5 61878 1224668 290.57
+ factor(CentralAir) 1 49963 1236582 291.01
+ factor(GarageQ) 4 51506 1235039 295.91
+ factor(Condition) 7 56831 1229715 298.14
+ factor(HouseStyle) 6 51999 1234546 299.56
+ factor(BasementFin) 6 50313 1236232 300.76
+ factor(HeatingQC) 3 34665 1251880 305.84
+ factor(LotConfig) 4 37324 1249222 305.96
+ factor(BasementC) 4 31824 1254721 309.86
+ factor(Heating) 4 14719 1271826 321.97
+ factor(ExteriorC) 3 9120 1277425 323.94
<none> 1286545 324.40
- factor(ExteriorQ) 3 1606904 2893449 1456.74
Step: AIC=154.74
Price ~ factor(ExteriorQ) + factor(BasementHt)
Df Sum of Sq RSS Cp
+ factor(GarageType) 6 124591 911170 78.481
+ factor(KitchenQ) 3 58853 976908 119.050
+ factor(GarageQ) 4 38749 997012 135.292
+ factor(HouseStyle) 6 43956 991805 135.604
+ factor(GarageC) 5 38839 996922 137.228
+ factor(CentralAir) 1 23287 1012474 140.246
+ factor(Foundation) 5 32902 1002859 141.434
+ factor(LotConfig) 4 24518 1011243 145.374
+ factor(HeatingQC) 3 19358 1016403 147.029
+ factor(Condition) 7 29957 1005804 147.521
<none> 1035761 154.742
+ factor(ExteriorC) 3 6387 1029374 156.217
+ factor(Heating) 4 9177 1026584 156.241
+ factor(BasementC) 3 5603 1030159 156.773
+ factor(BasementFin) 5 9762 1025999 157.827
- factor(BasementHt) 4 250784 1286545 324.400
- factor(ExteriorQ) 3 301775 1337536 362.522
Step: AIC=78.48
Price ~ factor(ExteriorQ) + factor(BasementHt) + factor(GarageType)
Df Sum of Sq RSS Cp
+ factor(KitchenQ) 3 46077 865093 51.839
+ factor(HouseStyle) 6 33520 877650 66.735
+ factor(Condition) 7 30314 880856 71.006
+ factor(LotConfig) 4 17289 893881 74.233
+ factor(GarageQ) 3 13232 897938 75.107
+ factor(BasementC) 3 11289 899881 76.484
+ factor(CentralAir) 1 4507 906662 77.288
+ factor(HeatingQC) 3 9175 901995 77.981
+ factor(ExteriorC) 3 8888 902282 78.184
+ factor(Foundation) 5 14356 896814 78.311
<none> 911170 78.481
+ factor(GarageC) 4 7043 904127 81.491
+ factor(Heating) 4 5425 905744 82.637
+ factor(BasementFin) 5 6296 904874 84.021
- factor(GarageType) 6 124591 1035761 154.742
- factor(BasementHt) 4 161777 1072946 185.085
- factor(ExteriorQ) 3 262738 1173908 258.607
Step: AIC=51.84
Price ~ factor(ExteriorQ) + factor(BasementHt) + factor(GarageType) +
factor(KitchenQ)
Df Sum of Sq RSS Cp
+ factor(HouseStyle) 6 29516 835576 42.930
+ factor(LotConfig) 4 16523 848569 48.134
+ factor(Condition) 7 20942 844151 51.004
+ factor(BasementC) 3 9422 855671 51.165
<none> 865093 51.839
+ factor(CentralAir) 1 2384 862708 52.150
+ factor(GarageQ) 3 7667 857426 52.408
+ factor(Foundation) 5 11629 853463 53.601
+ factor(ExteriorC) 3 4888 860204 54.376
+ factor(GarageC) 4 7585 857508 54.466
+ factor(HeatingQC) 3 2954 862139 55.747
+ factor(Heating) 4 3609 861484 57.283
+ factor(BasementFin) 5 4771 860322 58.459
- factor(KitchenQ) 3 46077 911170 78.481
- factor(GarageType) 6 111815 976908 119.050
- factor(ExteriorQ) 3 111470 976562 124.805
- factor(BasementHt) 4 131934 997027 137.303
Step: AIC=42.93
Price ~ factor(ExteriorQ) + factor(BasementHt) + factor(GarageType) +
factor(KitchenQ) + factor(HouseStyle)
Df Sum of Sq RSS Cp
+ factor(LotConfig) 4 19231 816346 37.306
+ factor(BasementC) 3 10072 825504 41.795
+ factor(Condition) 7 21361 814215 41.797
+ factor(CentralAir) 1 3577 832000 42.396
<none> 835576 42.930
+ factor(GarageQ) 3 7821 827756 43.389
+ factor(GarageC) 4 7300 828276 45.758
+ factor(ExteriorC) 3 3909 831667 46.160
+ factor(BasementFin) 5 8833 826743 46.672
+ factor(Foundation) 5 8749 826827 46.732
+ factor(HeatingQC) 3 2709 832867 47.010
+ factor(Heating) 4 2844 832732 48.915
- factor(HouseStyle) 6 29516 865093 51.839
- factor(KitchenQ) 3 42073 877650 66.735
- factor(GarageType) 6 103598 939174 104.319
- factor(ExteriorQ) 3 103129 938705 109.987
- factor(BasementHt) 4 135911 971487 131.210
Step: AIC=37.31
Price ~ factor(ExteriorQ) + factor(BasementHt) + factor(GarageType) +
factor(KitchenQ) + factor(HouseStyle) + factor(LotConfig)
Df Sum of Sq RSS Cp
+ factor(Condition) 7 21085 795261 36.370
+ factor(BasementC) 3 9549 806796 36.542
+ factor(CentralAir) 1 3152 813193 37.073
<none> 816346 37.306
+ factor(GarageQ) 3 5790 810556 39.205
+ factor(BasementFin) 5 9685 806661 40.446
+ factor(ExteriorC) 3 3928 812418 40.524
+ factor(Foundation) 5 9138 807207 40.833
+ factor(GarageC) 4 6012 810334 41.048
+ factor(HeatingQC) 3 2690 813656 41.401
- factor(LotConfig) 4 19231 835576 42.930
+ factor(Heating) 4 2312 814033 43.669
- factor(HouseStyle) 6 32224 848569 48.134
- factor(KitchenQ) 3 41042 857388 60.381
- factor(GarageType) 6 98041 914387 94.760
- factor(ExteriorQ) 3 107192 923538 107.243
- factor(BasementHt) 4 128373 944719 120.247
Step: AIC=36.37
Price ~ factor(ExteriorQ) + factor(BasementHt) + factor(GarageType) +
factor(KitchenQ) + factor(HouseStyle) + factor(LotConfig) +
factor(Condition)
Df Sum of Sq RSS Cp
<none> 795261 36.370
+ factor(BasementC) 3 7759 787502 36.873
- factor(Condition) 7 21085 816346 37.306
+ factor(CentralAir) 1 825 794436 37.786
+ factor(GarageQ) 3 5350 789911 38.580
+ factor(Foundation) 5 10768 784493 38.742
+ factor(ExteriorC) 3 3198 792063 40.104
+ factor(BasementFin) 5 8249 787012 40.526
+ factor(HeatingQC) 3 1677 793584 41.182
+ factor(GarageC) 4 3844 791417 41.647
- factor(LotConfig) 4 18954 814215 41.797
+ factor(Heating) 4 2858 792403 42.345
- factor(HouseStyle) 6 32660 827921 47.506
- factor(KitchenQ) 3 33199 828460 53.889
- factor(GarageType) 6 98676 893937 94.273
- factor(ExteriorQ) 3 97091 892352 99.150
- factor(BasementHt) 4 126197 921458 117.769
Call:
lm(formula = Price ~ factor(ExteriorQ) + factor(BasementHt) +
factor(GarageType) + factor(KitchenQ) + factor(HouseStyle) +
factor(LotConfig) + factor(Condition), data = AmesTrain6a)
Coefficients:
(Intercept) factor(ExteriorQ)Fa factor(ExteriorQ)Gd factor(ExteriorQ)TA
371.587 -127.471 -64.446 -97.138
factor(BasementHt)Fa factor(BasementHt)Gd factor(BasementHt)None factor(BasementHt)TA
-101.217 -61.932 -93.392 -80.750
factor(GarageType)Attchd factor(GarageType)Basment factor(GarageType)BuiltIn factor(GarageType)CarPort
-15.095 -33.555 -7.098 -67.326
factor(GarageType)Detchd factor(GarageType)None factor(KitchenQ)Fa factor(KitchenQ)Gd
-46.666 -47.290 -55.587 -34.705
factor(KitchenQ)TA factor(HouseStyle)1.5Unf factor(HouseStyle)1Story factor(HouseStyle)2.5Unf
-43.160 -12.327 -2.912 32.865
factor(HouseStyle)2Story factor(HouseStyle)SFoyer factor(HouseStyle)SLvl factor(LotConfig)CulDSac
9.627 -28.005 -8.037 9.395
factor(LotConfig)FR2 factor(LotConfig)FR3 factor(LotConfig)Inside factor(Condition)3
-20.856 12.616 -7.700 -6.741
factor(Condition)4 factor(Condition)5 factor(Condition)6 factor(Condition)7
7.656 22.628 20.582 24.883
factor(Condition)8 factor(Condition)9
32.250 41.650
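A note on reading the step() output in this section: because each call passes scale=MSE, the value R prints as "AIC" on the Step lines is Mallows' Cp (the column header in each table says so). It is computed as RSS/MSE - n + 2p, where p counts the fitted coefficients including the intercept. For the intercept-only start of the numeric search later in this section, the full numeric model has RSS 586820 on 580 residual degrees of freedom, so 2893449/(586820/580) - 595 + 2 = 2266.82, matching the printed value. The sketch below checks the same identity on R's built-in mtcars data rather than the Ames data; the variables in it are illustrative assumptions, not part of our analysis.

```r
# Illustration on the built-in mtcars data (not the Ames data).
full <- lm(mpg ~ wt + hp + qsec, data = mtcars)
MSE  <- summary(full)$sigma^2          # scale estimate from the largest model

sub  <- lm(mpg ~ wt, data = mtcars)    # one candidate submodel
n    <- nrow(mtcars)
p    <- length(coef(sub))              # coefficients, including the intercept
rss  <- sum(resid(sub)^2)

cp_manual <- rss / MSE - n + 2 * p               # Mallows' Cp by hand
cp_step   <- extractAIC(sub, scale = MSE)[2]     # the value step() reports

all.equal(cp_manual, cp_step)
```

One consequence is that the largest model always has Cp equal to its own number of coefficients, which is a quick sanity check on any of the tables.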
modTransformNumeric=lm(Price~LotArea+I(LotArea^2)+YearBuilt+I(YearBuilt^2)+BasementSF+I(BasementSF^2)+GarageCars+I(GarageCars^2)+WoodDeckSF+I(WoodDeckSF^2)+GroundSF+I(GroundSF^2)+FullBath+TotalRooms, data=AmesTrain6a)
summary(modTransformNumeric)
Call:
lm(formula = Price ~ LotArea + I(LotArea^2) + YearBuilt + I(YearBuilt^2) +
BasementSF + I(BasementSF^2) + GarageCars + I(GarageCars^2) +
WoodDeckSF + I(WoodDeckSF^2) + GroundSF + I(GroundSF^2) +
FullBath + TotalRooms, data = AmesTrain6a)
Residuals:
Min 1Q Median 3Q Max
-119.888 -17.804 -0.971 14.951 140.922
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.871e+04 5.613e+03 5.114 4.29e-07 ***
LotArea 4.700e-03 1.016e-03 4.626 4.59e-06 ***
I(LotArea^2) -1.361e-07 3.983e-08 -3.418 0.000675 ***
YearBuilt -3.001e+01 5.740e+00 -5.229 2.38e-07 ***
I(YearBuilt^2) 7.858e-03 1.468e-03 5.355 1.24e-07 ***
BasementSF -1.685e-03 1.199e-02 -0.140 0.888336
I(BasementSF^2) 1.845e-05 5.519e-06 3.343 0.000881 ***
GarageCars -9.694e+00 6.511e+00 -1.489 0.137091
I(GarageCars^2) 5.723e+00 1.974e+00 2.898 0.003893 **
WoodDeckSF 1.197e-01 2.794e-02 4.282 2.17e-05 ***
I(WoodDeckSF^2) -2.188e-04 7.421e-05 -2.949 0.003320 **
GroundSF 3.191e-02 1.748e-02 1.826 0.068412 .
I(GroundSF^2) 1.557e-05 4.724e-06 3.296 0.001041 **
FullBath -1.051e+01 3.621e+00 -2.901 0.003859 **
TotalRooms -5.953e+00 1.600e+00 -3.721 0.000218 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 31.81 on 580 degrees of freedom
Multiple R-squared: 0.7972, Adjusted R-squared: 0.7923
F-statistic: 162.8 on 14 and 580 DF, p-value: < 2.2e-16
MSE=(summary(modTransformNumeric)$sigma)^2
step(none,scope=list(upper=modTransformNumeric),scale=MSE)
Start: AIC=2266.82
Price ~ 1
Df Sum of Sq RSS Cp
+ GroundSF 1 1427972 1465476 857.44
+ I(GroundSF^2) 1 1394445 1499004 890.58
+ I(GarageCars^2) 1 1276249 1617200 1007.40
+ GarageCars 1 1118553 1774895 1163.27
+ I(BasementSF^2) 1 1109930 1783519 1171.79
+ BasementSF 1 1010777 1882672 1269.79
+ I(YearBuilt^2) 1 1005760 1887689 1274.75
+ YearBuilt 1 996521 1896928 1283.88
+ FullBath 1 910528 1982921 1368.87
+ TotalRooms 1 613129 2280320 1662.82
+ WoodDeckSF 1 350672 2542776 1922.22
+ LotArea 1 291166 2602283 1981.04
+ I(WoodDeckSF^2) 1 185418 2708031 2085.56
+ I(LotArea^2) 1 169321 2724127 2101.47
<none> 2893449 2266.82
Step: AIC=857.44
Price ~ GroundSF
Df Sum of Sq RSS Cp
+ I(YearBuilt^2) 1 532682 932795 332.95
+ YearBuilt 1 530339 935137 335.27
+ I(BasementSF^2) 1 391579 1073898 472.42
+ BasementSF 1 377033 1088444 486.79
+ I(GarageCars^2) 1 339234 1126243 524.15
+ GarageCars 1 285885 1179592 576.88
+ WoodDeckSF 1 113090 1352386 747.67
+ TotalRooms 1 90961 1374516 769.54
+ FullBath 1 57487 1407989 802.62
+ I(WoodDeckSF^2) 1 35416 1430060 824.44
+ LotArea 1 16082 1449394 843.55
+ I(LotArea^2) 1 2964 1462513 856.51
+ I(GroundSF^2) 1 2689 1462787 856.79
<none> 1465476 857.44
- GroundSF 1 1427972 2893449 2266.82
Step: AIC=332.95
Price ~ GroundSF + I(YearBuilt^2)
Df Sum of Sq RSS Cp
+ I(BasementSF^2) 1 183645 749150 153.44
+ BasementSF 1 170574 762221 166.36
+ I(GarageCars^2) 1 89747 843047 246.25
+ GarageCars 1 48072 884723 287.44
+ WoodDeckSF 1 43893 888902 291.57
+ YearBuilt 1 33500 899295 301.84
+ LotArea 1 30861 901934 304.45
+ I(WoodDeckSF^2) 1 23620 909175 311.61
+ TotalRooms 1 22576 910219 312.64
+ I(GroundSF^2) 1 13999 918796 321.12
+ I(LotArea^2) 1 12374 920421 322.72
+ FullBath 1 9402 923393 325.66
<none> 932795 332.95
- I(YearBuilt^2) 1 532682 1465476 857.44
- GroundSF 1 954894 1887689 1274.75
Step: AIC=153.44
Price ~ GroundSF + I(YearBuilt^2) + I(BasementSF^2)
Df Sum of Sq RSS Cp
+ I(GarageCars^2) 1 40765 708385 115.15
+ YearBuilt 1 35776 713374 120.08
+ WoodDeckSF 1 25372 723778 130.37
+ GarageCars 1 22075 727075 133.62
+ I(GroundSF^2) 1 19629 729521 136.04
+ TotalRooms 1 13510 735640 142.09
+ LotArea 1 10237 738913 145.33
+ FullBath 1 8302 740849 147.24
+ I(WoodDeckSF^2) 1 8291 740859 147.25
+ I(LotArea^2) 1 3740 745410 151.75
<none> 749150 153.44
+ BasementSF 1 361 748789 155.09
- I(BasementSF^2) 1 183645 932795 332.95
- I(YearBuilt^2) 1 324748 1073898 472.42
- GroundSF 1 590645 1339795 735.22
Step: AIC=115.15
Price ~ GroundSF + I(YearBuilt^2) + I(BasementSF^2) + I(GarageCars^2)
Df Sum of Sq RSS Cp
+ WoodDeckSF 1 21590 686794 95.812
+ YearBuilt 1 21252 687133 96.147
+ I(GroundSF^2) 1 16503 691882 100.840
+ TotalRooms 1 14971 693414 102.355
+ FullBath 1 10282 698103 106.989
+ GarageCars 1 9688 698697 107.577
+ LotArea 1 8453 699932 108.797
+ I(WoodDeckSF^2) 1 7416 700969 109.822
+ I(LotArea^2) 1 2759 705626 114.425
<none> 708385 115.152
+ BasementSF 1 900 707484 116.262
- I(GarageCars^2) 1 40765 749150 153.443
- I(BasementSF^2) 1 134663 843047 246.249
- I(YearBuilt^2) 1 209182 917567 319.902
- GroundSF 1 380195 1088580 488.928
Step: AIC=95.81
Price ~ GroundSF + I(YearBuilt^2) + I(BasementSF^2) + I(GarageCars^2) +
WoodDeckSF
Df Sum of Sq RSS Cp
+ YearBuilt 1 21448 665346 76.613
+ I(GroundSF^2) 1 14411 672384 83.569
+ TotalRooms 1 13645 673149 84.326
+ I(WoodDeckSF^2) 1 11895 674899 86.055
+ GarageCars 1 9410 677385 88.512
+ FullBath 1 9346 677449 88.575
+ LotArea 1 7069 679725 90.826
+ I(LotArea^2) 1 2216 684578 95.622
<none> 686794 95.812
+ BasementSF 1 951 685843 96.872
- WoodDeckSF 1 21590 708385 115.152
- I(GarageCars^2) 1 36984 723778 130.366
- I(BasementSF^2) 1 122584 809379 214.972
- I(YearBuilt^2) 1 192934 879729 284.504
- GroundSF 1 361179 1047973 450.794
Step: AIC=76.61
Price ~ GroundSF + I(YearBuilt^2) + I(BasementSF^2) + I(GarageCars^2) +
WoodDeckSF + YearBuilt
Df Sum of Sq RSS Cp
+ I(GroundSF^2) 1 19727 645619 59.116
+ FullBath 1 18008 647338 60.815
+ TotalRooms 1 13928 651419 64.848
+ LotArea 1 11641 653705 67.108
+ I(WoodDeckSF^2) 1 7143 658204 71.554
+ GarageCars 1 5119 660227 73.554
+ I(LotArea^2) 1 4682 660664 73.985
<none> 665346 76.613
+ BasementSF 1 886 664460 77.737
- YearBuilt 1 21448 686794 95.812
- WoodDeckSF 1 21787 687133 96.147
- I(YearBuilt^2) 1 22682 688029 97.032
- I(GarageCars^2) 1 23268 688615 97.611
- I(BasementSF^2) 1 129396 794742 202.506
- GroundSF 1 305148 970494 376.215
Step: AIC=59.12
Price ~ GroundSF + I(YearBuilt^2) + I(BasementSF^2) + I(GarageCars^2) +
WoodDeckSF + YearBuilt + I(GroundSF^2)
Df Sum of Sq RSS Cp
+ LotArea 1 14148 631471 47.132
+ FullBath 1 11777 633843 49.476
+ TotalRooms 1 10290 635329 50.945
+ I(WoodDeckSF^2) 1 8145 637474 53.065
+ I(LotArea^2) 1 5908 639711 55.276
- GroundSF 1 6 645625 57.121
<none> 645619 59.116
+ GarageCars 1 1400 644219 59.732
+ BasementSF 1 621 644998 60.502
- WoodDeckSF 1 19353 664973 76.244
- I(GarageCars^2) 1 19622 665241 76.509
- I(GroundSF^2) 1 19727 665346 76.613
- YearBuilt 1 26765 672384 83.569
- I(YearBuilt^2) 1 28164 673783 84.953
- I(BasementSF^2) 1 136647 782266 192.174
Step: AIC=47.13
Price ~ GroundSF + I(YearBuilt^2) + I(BasementSF^2) + I(GarageCars^2) +
WoodDeckSF + YearBuilt + I(GroundSF^2) + LotArea
Df Sum of Sq RSS Cp
+ FullBath 1 11468 620002 37.797
+ TotalRooms 1 10759 620711 38.497
+ I(WoodDeckSF^2) 1 9942 621528 39.305
+ I(LotArea^2) 1 8913 622558 40.322
- GroundSF 1 368 631839 45.496
<none> 631471 47.132
+ GarageCars 1 1735 629735 47.416
+ BasementSF 1 883 630588 48.259
- LotArea 1 14148 645619 59.116
- I(GarageCars^2) 1 16765 648236 61.702
- WoodDeckSF 1 17392 648863 62.322
- I(GroundSF^2) 1 22235 653705 67.108
- YearBuilt 1 32676 664147 77.428
- I(YearBuilt^2) 1 34242 665713 78.976
- I(BasementSF^2) 1 121959 753429 165.673
Step: AIC=37.8
Price ~ GroundSF + I(YearBuilt^2) + I(BasementSF^2) + I(GarageCars^2) +
WoodDeckSF + YearBuilt + I(GroundSF^2) + LotArea + FullBath
Df Sum of Sq RSS Cp
+ I(WoodDeckSF^2) 1 9818 610184 30.093
+ I(LotArea^2) 1 9362 610640 30.543
+ TotalRooms 1 8165 611837 31.726
- GroundSF 1 281 620283 36.074
<none> 620002 37.797
+ GarageCars 1 1773 618229 38.044
+ BasementSF 1 601 619402 39.203
- FullBath 1 11468 631471 47.132
- LotArea 1 13840 633843 49.476
- I(GroundSF^2) 1 15599 635602 51.215
- WoodDeckSF 1 16815 636817 52.416
- I(GarageCars^2) 1 17166 637169 52.763
- YearBuilt 1 39547 659549 74.884
- I(YearBuilt^2) 1 41333 661335 76.649
- I(BasementSF^2) 1 120447 740449 154.844
Step: AIC=30.09
Price ~ GroundSF + I(YearBuilt^2) + I(BasementSF^2) + I(GarageCars^2) +
WoodDeckSF + YearBuilt + I(GroundSF^2) + LotArea + FullBath +
I(WoodDeckSF^2)
Df Sum of Sq RSS Cp
+ TotalRooms 1 8914 601270 23.282
+ I(LotArea^2) 1 7997 602188 24.189
- GroundSF 1 203 610388 28.294
<none> 610184 30.093
+ GarageCars 1 1555 608629 30.555
+ BasementSF 1 271 609914 31.825
- I(WoodDeckSF^2) 1 9818 620002 37.797
- FullBath 1 11344 621528 39.305
- LotArea 1 15609 625793 43.520
- I(GarageCars^2) 1 16074 626258 43.980
- I(GroundSF^2) 1 16735 626919 44.633
- WoodDeckSF 1 20936 631120 48.785
- YearBuilt 1 32824 643008 60.535
- I(YearBuilt^2) 1 34342 644526 62.036
- I(BasementSF^2) 1 125712 735897 152.344
Step: AIC=23.28
Price ~ GroundSF + I(YearBuilt^2) + I(BasementSF^2) + I(GarageCars^2) +
WoodDeckSF + YearBuilt + I(GroundSF^2) + LotArea + FullBath +
I(WoodDeckSF^2) + TotalRooms
Df Sum of Sq RSS Cp
+ I(LotArea^2) 1 12165 589105 13.259
+ GarageCars 1 2384 598886 22.926
- GroundSF 1 1920 603191 23.180
<none> 601270 23.282
+ BasementSF 1 188 601082 25.097
- FullBath 1 8659 609930 29.841
- TotalRooms 1 8914 610184 30.093
- I(WoodDeckSF^2) 1 10567 611837 31.726
- I(GroundSF^2) 1 14417 615687 35.531
- LotArea 1 16168 617439 37.263
- I(GarageCars^2) 1 16908 618178 37.993
- WoodDeckSF 1 21577 622847 42.608
- YearBuilt 1 31490 632761 52.407
- I(YearBuilt^2) 1 32894 634164 53.794
- I(BasementSF^2) 1 119088 720358 138.986
Step: AIC=13.26
Price ~ GroundSF + I(YearBuilt^2) + I(BasementSF^2) + I(GarageCars^2) +
WoodDeckSF + YearBuilt + I(GroundSF^2) + LotArea + FullBath +
I(WoodDeckSF^2) + TotalRooms + I(LotArea^2)
Df Sum of Sq RSS Cp
+ GarageCars 1 2265 586840 13.020
<none> 589105 13.259
- GroundSF 1 2281 591386 13.513
+ BasementSF 1 43 589063 15.216
- FullBath 1 8572 597677 19.731
- I(WoodDeckSF^2) 1 8989 598094 20.143
- I(LotArea^2) 1 12165 601270 23.282
- TotalRooms 1 13082 602188 24.189
- I(GroundSF^2) 1 14593 603698 25.681
- I(GarageCars^2) 1 17119 606225 28.179
- WoodDeckSF 1 18817 607922 29.856
- LotArea 1 21975 611080 32.978
- YearBuilt 1 32298 621403 43.181
- I(YearBuilt^2) 1 33713 622818 44.580
- I(BasementSF^2) 1 105176 694281 115.212
Step: AIC=13.02
Price ~ GroundSF + I(YearBuilt^2) + I(BasementSF^2) + I(GarageCars^2) +
WoodDeckSF + YearBuilt + I(GroundSF^2) + LotArea + FullBath +
I(WoodDeckSF^2) + TotalRooms + I(LotArea^2) + GarageCars
Df Sum of Sq RSS Cp
<none> 586840 13.020
- GarageCars 1 2265 589105 13.259
- GroundSF 1 3385 590225 14.365
+ BasementSF 1 20 586820 15.000
- FullBath 1 8496 595336 19.417
- I(GarageCars^2) 1 8580 595421 19.500
- I(WoodDeckSF^2) 1 8783 595623 19.700
- I(GroundSF^2) 1 10972 597812 21.864
- I(LotArea^2) 1 12046 598886 22.926
- TotalRooms 1 13997 600837 24.854
- WoodDeckSF 1 18542 605382 29.346
- LotArea 1 22005 608845 32.769
- YearBuilt 1 27649 614489 38.347
- I(YearBuilt^2) 1 28994 615834 39.677
- I(BasementSF^2) 1 95658 682498 105.566
Call:
lm(formula = Price ~ GroundSF + I(YearBuilt^2) + I(BasementSF^2) +
I(GarageCars^2) + WoodDeckSF + YearBuilt + I(GroundSF^2) +
LotArea + FullBath + I(WoodDeckSF^2) + TotalRooms + I(LotArea^2) +
GarageCars, data = AmesTrain6a)
Coefficients:
(Intercept) GroundSF I(YearBuilt^2) I(BasementSF^2) I(GarageCars^2) WoodDeckSF
2.869e+04 3.196e-02 7.855e-03 1.772e-05 5.739e+00 1.194e-01
YearBuilt I(GroundSF^2) LotArea FullBath I(WoodDeckSF^2) TotalRooms
-3.000e+01 1.555e-05 4.677e-03 -1.048e+01 -2.181e-04 -5.939e+00
I(LotArea^2) GarageCars
-1.351e-07 -9.734e+00
We chose to narrow down our pool of variables separately for the categorical and numeric predictors before combining them. It was easier to analyze the numeric variables together in one model and the categorical variables together in another. Once we had narrowed down the categorical and numeric variables separately, we combined them in this model (modTransformFull) and re-ran stepwise, forward, and backward selection, as shown below.
modTransformFull=lm(Price~LotArea+I(LotArea^2)+YearBuilt+I(YearBuilt^2)+BasementSF+I(BasementSF^2)+GarageCars+I(GarageCars^2)+WoodDeckSF+I(WoodDeckSF^2)+GroundSF+I(GroundSF^2)+FullBath+TotalRooms+factor(HouseStyle)+factor(ExteriorQ)+factor(BasementFin)+factor(KitchenQ)+factor(BasementHt)+factor(Condition), data=AmesTrain6a)
summary(modTransformFull)
Call:
lm(formula = Price ~ LotArea + I(LotArea^2) + YearBuilt + I(YearBuilt^2) +
BasementSF + I(BasementSF^2) + GarageCars + I(GarageCars^2) +
WoodDeckSF + I(WoodDeckSF^2) + GroundSF + I(GroundSF^2) +
FullBath + TotalRooms + factor(HouseStyle) + factor(ExteriorQ) +
factor(BasementFin) + factor(KitchenQ) + factor(BasementHt) +
factor(Condition), data = AmesTrain6a)
Residuals:
Min 1Q Median 3Q Max
-86.54 -12.42 -1.22 11.65 112.04
Coefficients: (1 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.353e+03 5.703e+03 1.114 0.265747
LotArea 4.730e-03 7.867e-04 6.012 3.34e-09 ***
I(LotArea^2) -1.443e-07 3.053e-08 -4.727 2.89e-06 ***
YearBuilt -6.891e+00 5.857e+00 -1.176 0.239934
I(YearBuilt^2) 1.896e-03 1.504e-03 1.261 0.207799
BasementSF 9.301e-03 1.548e-02 0.601 0.548100
I(BasementSF^2) 2.151e-06 6.334e-06 0.340 0.734324
GarageCars -6.668e+00 4.962e+00 -1.344 0.179609
I(GarageCars^2) 4.167e+00 1.499e+00 2.779 0.005632 **
WoodDeckSF 3.723e-02 2.138e-02 1.742 0.082123 .
I(WoodDeckSF^2) -7.028e-05 5.609e-05 -1.253 0.210778
GroundSF 5.170e-02 1.458e-02 3.546 0.000424 ***
I(GroundSF^2) 8.937e-06 3.653e-06 2.447 0.014734 *
FullBath -7.836e+00 2.773e+00 -2.825 0.004892 **
TotalRooms -3.449e+00 1.250e+00 -2.760 0.005979 **
factor(HouseStyle)1.5Unf 2.978e+01 1.161e+01 2.565 0.010581 *
factor(HouseStyle)1Story 1.274e+01 4.706e+00 2.707 0.007007 **
factor(HouseStyle)2.5Unf 1.059e+01 1.468e+01 0.721 0.470962
factor(HouseStyle)2Story 4.084e+00 4.376e+00 0.933 0.351062
factor(HouseStyle)SFoyer 1.662e+00 7.493e+00 0.222 0.824604
factor(HouseStyle)SLvl 8.772e+00 5.608e+00 1.564 0.118315
factor(ExteriorQ)Fa -7.441e+01 1.354e+01 -5.497 5.92e-08 ***
factor(ExteriorQ)Gd -4.390e+01 8.287e+00 -5.297 1.70e-07 ***
factor(ExteriorQ)TA -6.023e+01 8.985e+00 -6.704 5.03e-11 ***
factor(BasementFin)BLQ 6.407e-01 4.208e+00 0.152 0.879041
factor(BasementFin)GLQ 3.279e+00 3.373e+00 0.972 0.331435
factor(BasementFin)LwQ -9.849e+00 5.290e+00 -1.862 0.063155 .
factor(BasementFin)None -5.252e+01 1.285e+01 -4.087 5.02e-05 ***
factor(BasementFin)Rec -6.342e-01 4.504e+00 -0.141 0.888090
factor(BasementFin)Unf -8.310e+00 3.269e+00 -2.542 0.011286 *
factor(KitchenQ)Fa -2.945e+01 8.783e+00 -3.353 0.000855 ***
factor(KitchenQ)Gd -2.078e+01 5.712e+00 -3.638 0.000301 ***
factor(KitchenQ)TA -2.097e+01 5.906e+00 -3.550 0.000418 ***
factor(BasementHt)Fa -4.652e+01 8.935e+00 -5.207 2.71e-07 ***
factor(BasementHt)Gd -3.494e+01 5.209e+00 -6.707 4.93e-11 ***
factor(BasementHt)None NA NA NA NA
factor(BasementHt)TA -4.414e+01 6.429e+00 -6.866 1.79e-11 ***
factor(Condition)3 3.319e+00 2.715e+01 0.122 0.902732
factor(Condition)4 1.302e+01 2.654e+01 0.491 0.623871
factor(Condition)5 2.482e+01 2.631e+01 0.943 0.346075
factor(Condition)6 3.357e+01 2.636e+01 1.274 0.203352
factor(Condition)7 4.118e+01 2.636e+01 1.562 0.118857
factor(Condition)8 4.995e+01 2.643e+01 1.890 0.059297 .
factor(Condition)9 6.333e+01 2.722e+01 2.326 0.020363 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 23.41 on 552 degrees of freedom
Multiple R-squared: 0.8954, Adjusted R-squared: 0.8875
F-statistic: 112.5 on 42 and 552 DF, p-value: < 2.2e-16
We made this model by combining the categorical and numerical variables suggested by our separate stepwise selections.
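The two-stage procedure described above can be sketched roughly as follows. This is a minimal illustration on R's built-in mtcars data, not our Ames code: the helper select_terms() and the candidate variables are hypothetical, and the search direction is set to "both" explicitly (which is also what step() defaults to when a scope is supplied).

```r
# Illustration on the built-in mtcars data; variable choices are hypothetical.
dat  <- mtcars
none <- lm(mpg ~ 1, data = dat)        # intercept-only starting model

# Hypothetical helper: stepwise search from `none` up to `upper`,
# scored by Mallows' Cp, returning the labels of the terms that survive.
select_terms <- function(upper) {
  mse <- summary(upper)$sigma^2
  fit <- step(none, scope = list(upper = formula(upper)),
              scale = mse, direction = "both", trace = 0)
  attr(terms(fit), "term.labels")
}

# Stage 1: search the numeric and categorical pools separately.
num_full <- lm(mpg ~ wt + I(wt^2) + hp + I(hp^2) + qsec, data = dat)
cat_full <- lm(mpg ~ factor(cyl) + factor(gear) + factor(am), data = dat)
keep_num <- select_terms(num_full)
keep_cat <- select_terms(cat_full)

# Stage 2: combine the survivors into one model and search again.
combined <- lm(reformulate(c(keep_num, keep_cat), response = "mpg"), data = dat)
final    <- step(none, scope = list(upper = formula(combined)),
                 scale = summary(combined)$sigma^2, trace = 0)
formula(final)
```

Setting trace = 0 suppresses the step-by-step tables; leaving it at its default prints output in the same layout as the tables in this section.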
Stepwise selection:
MSE=(summary(modTransformFull)$sigma)^2
step(none,scope=list(upper=modTransformFull),scale=MSE)
Start: AIC=4685.16
Price ~ 1
Df Sum of Sq RSS Cp
+ factor(ExteriorQ) 3 1606904 1286545 1759.9
+ factor(BasementHt) 4 1555913 1337536 1854.9
+ GroundSF 1 1427972 1465476 2082.3
+ I(GroundSF^2) 1 1394445 1499004 2143.4
+ factor(KitchenQ) 3 1305178 1588271 2310.3
+ I(GarageCars^2) 1 1276249 1617200 2359.1
+ GarageCars 1 1118553 1774895 2646.7
+ I(BasementSF^2) 1 1109930 1783519 2662.5
+ BasementSF 1 1010777 1882672 2843.3
+ I(YearBuilt^2) 1 1005760 1887689 2852.5
+ YearBuilt 1 996521 1896928 2869.3
+ FullBath 1 910528 1982921 3026.2
+ TotalRooms 1 613129 2280320 3568.7
+ factor(Condition) 7 546078 2347371 3703.0
+ factor(BasementFin) 6 515220 2378228 3757.3
+ WoodDeckSF 1 350672 2542776 4047.5
+ LotArea 1 291166 2602283 4156.0
+ factor(HouseStyle) 6 230783 2662666 4276.2
+ I(WoodDeckSF^2) 1 185418 2708031 4348.9
+ I(LotArea^2) 1 169321 2724127 4378.3
<none> 2893449 4685.2
Step: AIC=1759.89
Price ~ factor(ExteriorQ)
Df Sum of Sq RSS Cp
+ I(GroundSF^2) 1 559835 726710 740.65
+ GroundSF 1 551843 734702 755.23
+ I(GarageCars^2) 1 281632 1004913 1248.14
+ I(BasementSF^2) 1 265658 1020887 1277.28
+ BasementSF 1 253821 1032724 1298.87
+ TotalRooms 1 250074 1036472 1305.71
+ factor(BasementHt) 4 250784 1035761 1310.41
+ GarageCars 1 238478 1048067 1326.86
+ FullBath 1 202387 1084158 1392.70
+ LotArea 1 169637 1116908 1452.44
+ I(YearBuilt^2) 1 148483 1138062 1491.03
+ YearBuilt 1 147695 1138850 1492.46
+ factor(KitchenQ) 3 111799 1174746 1561.94
+ I(LotArea^2) 1 100624 1185921 1578.33
+ WoodDeckSF 1 86278 1200267 1604.50
+ I(WoodDeckSF^2) 1 71648 1214897 1631.19
+ factor(Condition) 7 56831 1229715 1670.22
+ factor(HouseStyle) 6 51999 1234546 1677.03
+ factor(BasementFin) 6 50313 1236232 1680.11
<none> 1286545 1759.89
- factor(ExteriorQ) 3 1606904 2893449 4685.16
Step: AIC=740.65
Price ~ factor(ExteriorQ) + I(GroundSF^2)
Df Sum of Sq RSS Cp
+ factor(BasementHt) 4 160720 565990 455.47
+ I(YearBuilt^2) 1 157397 569313 455.53
+ YearBuilt 1 157382 569329 455.56
+ BasementSF 1 121924 604786 520.24
+ I(BasementSF^2) 1 112159 614551 538.05
+ factor(HouseStyle) 6 113570 613140 545.48
+ factor(BasementFin) 6 92363 634347 584.16
+ I(GarageCars^2) 1 66215 660495 621.86
+ GarageCars 1 63555 663155 626.71
+ factor(KitchenQ) 3 56948 669762 642.77
+ factor(Condition) 7 53872 672838 656.38
+ WoodDeckSF 1 33869 692841 680.86
+ LotArea 1 30692 696018 686.66
+ I(WoodDeckSF^2) 1 16874 709836 711.87
+ TotalRooms 1 13037 713673 718.87
+ I(LotArea^2) 1 11312 715398 722.01
+ FullBath 1 6776 719934 730.29
+ GroundSF 1 3391 723319 736.46
<none> 726710 740.65
- I(GroundSF^2) 1 559835 1286545 1759.89
- factor(ExteriorQ) 3 772294 1499004 2143.45
Step: AIC=455.47
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt)
Df Sum of Sq RSS Cp
+ factor(HouseStyle) 6 84078 481912 314.09
+ BasementSF 1 74839 491150 320.95
+ YearBuilt 1 66808 499182 335.60
+ I(YearBuilt^2) 1 66606 499384 335.96
+ I(BasementSF^2) 1 65947 500043 337.17
+ GarageCars 1 37702 528288 388.69
+ I(GarageCars^2) 1 37418 528572 389.21
+ factor(BasementFin) 5 39757 526233 392.94
+ LotArea 1 34700 531290 394.17
+ factor(Condition) 7 37467 528522 401.12
+ factor(KitchenQ) 3 29273 536717 408.07
+ WoodDeckSF 1 14324 551666 431.34
+ I(LotArea^2) 1 13501 552489 432.84
+ I(WoodDeckSF^2) 1 10439 555551 438.42
+ TotalRooms 1 5224 560766 447.94
+ GroundSF 1 3983 562007 450.20
<none> 565990 455.47
+ FullBath 1 200 565790 457.10
- factor(BasementHt) 4 160720 726710 740.65
- factor(ExteriorQ) 3 169432 735422 758.54
- I(GroundSF^2) 1 469771 1035761 1310.41
Step: AIC=314.09
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(HouseStyle)
Df Sum of Sq RSS Cp
+ factor(Condition) 7 38427 443485 257.99
+ YearBuilt 1 30128 451783 261.13
+ I(YearBuilt^2) 1 30095 451816 261.19
+ factor(BasementFin) 5 26815 455097 275.18
+ factor(KitchenQ) 3 24506 457406 275.39
+ GarageCars 1 20826 461085 278.10
+ I(GarageCars^2) 1 20337 461574 278.99
+ BasementSF 1 19221 462690 281.03
+ I(BasementSF^2) 1 13778 468134 290.96
+ GroundSF 1 12741 469171 292.85
+ LotArea 1 12573 469338 293.16
+ WoodDeckSF 1 8474 473438 300.63
+ I(WoodDeckSF^2) 1 4728 477184 307.47
+ I(LotArea^2) 1 2753 479159 311.07
<none> 481912 314.09
+ TotalRooms 1 1025 480887 314.22
+ FullBath 1 125 481787 315.86
- factor(HouseStyle) 6 84078 565990 455.47
- factor(BasementHt) 4 131228 613140 545.48
- factor(ExteriorQ) 3 136831 618743 557.70
- I(GroundSF^2) 1 509894 991805 1242.23
Step: AIC=257.99
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(HouseStyle) + factor(Condition)
Df Sum of Sq RSS Cp
+ I(YearBuilt^2) 1 47529 395956 173.29
+ YearBuilt 1 47420 396065 173.49
+ BasementSF 1 25980 417505 212.60
+ I(GarageCars^2) 1 21214 422271 221.30
+ GarageCars 1 21029 422456 221.63
+ I(BasementSF^2) 1 19924 423561 223.65
+ factor(BasementFin) 5 20154 423331 231.23
+ factor(KitchenQ) 3 16659 426826 233.61
+ LotArea 1 14369 429116 233.78
+ GroundSF 1 13497 429988 235.37
+ WoodDeckSF 1 5209 438276 250.49
+ I(LotArea^2) 1 3787 439697 253.09
+ I(WoodDeckSF^2) 1 2155 441329 256.06
<none> 443485 257.99
+ TotalRooms 1 437 443048 259.20
+ FullBath 1 97 443388 259.82
- factor(Condition) 7 38427 481912 314.09
- factor(HouseStyle) 6 85038 528522 401.12
- factor(ExteriorQ) 3 119422 562907 469.84
- factor(BasementHt) 4 127941 571426 483.38
- I(GroundSF^2) 1 521124 964609 1206.62
Step: AIC=173.29
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(HouseStyle) + factor(Condition) + I(YearBuilt^2)
Df Sum of Sq RSS Cp
+ BasementSF 1 19141 376814 140.38
+ LotArea 1 17163 378793 143.99
+ I(BasementSF^2) 1 15213 380743 147.54
+ I(GarageCars^2) 1 14269 381686 149.26
+ GarageCars 1 11136 384819 154.98
+ GroundSF 1 10966 384990 155.29
+ factor(KitchenQ) 3 11718 384237 157.92
+ factor(BasementFin) 5 13149 382807 159.31
+ I(LotArea^2) 1 6413 389543 163.59
+ WoodDeckSF 1 5133 390822 165.93
+ I(WoodDeckSF^2) 1 2604 393352 170.54
+ FullBath 1 1859 394097 171.90
<none> 395956 173.29
+ TotalRooms 1 648 395307 174.11
+ YearBuilt 1 403 395553 174.56
- factor(HouseStyle) 6 48883 444838 250.46
- I(YearBuilt^2) 1 47529 443485 257.99
- factor(Condition) 7 55861 451816 261.19
- factor(BasementHt) 4 62522 458478 279.34
- factor(ExteriorQ) 3 91958 487914 335.04
- I(GroundSF^2) 1 520628 916584 1121.01
Step: AIC=140.38
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(HouseStyle) + factor(Condition) + I(YearBuilt^2) +
BasementSF
Df Sum of Sq RSS Cp
+ LotArea 1 15952 360862 113.28
+ I(GarageCars^2) 1 9570 367245 124.92
+ factor(KitchenQ) 3 11622 365192 125.18
+ GarageCars 1 7546 369268 128.61
+ factor(BasementFin) 5 11477 365337 129.44
+ I(LotArea^2) 1 6477 370337 130.56
+ WoodDeckSF 1 3870 372945 135.32
+ FullBath 1 3450 373364 136.08
+ GroundSF 1 3147 373667 136.63
+ I(WoodDeckSF^2) 1 1520 375294 139.60
+ TotalRooms 1 1403 375411 139.82
<none> 376814 140.38
+ I(BasementSF^2) 1 579 376236 141.32
+ YearBuilt 1 325 376490 141.78
- factor(HouseStyle) 6 7915 384730 142.81
- BasementSF 1 19141 395956 173.29
- I(YearBuilt^2) 1 40691 417505 212.60
- factor(BasementHt) 4 44791 421605 214.08
- factor(Condition) 7 62009 438823 239.49
- factor(ExteriorQ) 3 79015 455829 278.51
- I(GroundSF^2) 1 224483 601297 547.87
Step: AIC=113.28
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(HouseStyle) + factor(Condition) + I(YearBuilt^2) +
BasementSF + LotArea
Df Sum of Sq RSS Cp
+ I(LotArea^2) 1 10708 350155 95.744
+ factor(KitchenQ) 3 12063 348800 97.272
+ I(GarageCars^2) 1 7903 352959 100.860
+ factor(BasementFin) 5 11063 349799 103.096
+ GarageCars 1 5588 355274 105.083
+ FullBath 1 3607 357255 108.696
- factor(HouseStyle) 6 4395 365257 109.293
+ WoodDeckSF 1 3146 357716 109.538
+ TotalRooms 1 2411 358452 110.879
<none> 360862 113.277
+ GroundSF 1 903 359960 113.630
+ I(WoodDeckSF^2) 1 871 359991 113.688
+ YearBuilt 1 593 360269 114.195
+ I(BasementSF^2) 1 354 360508 114.631
- LotArea 1 15952 376814 140.376
- BasementSF 1 17931 378793 143.986
- factor(BasementHt) 4 43851 404713 185.268
- I(YearBuilt^2) 1 43356 404218 190.365
- factor(Condition) 7 62571 423434 213.418
- factor(ExteriorQ) 3 81904 442767 256.684
- I(GroundSF^2) 1 167305 528168 416.471
Step: AIC=95.74
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(HouseStyle) + factor(Condition) + I(YearBuilt^2) +
BasementSF + LotArea + I(LotArea^2)
Df Sum of Sq RSS Cp
+ factor(KitchenQ) 3 11542 338612 80.689
+ factor(BasementFin) 5 12613 337542 82.736
+ I(GarageCars^2) 1 7747 342408 83.612
+ GarageCars 1 5656 344499 87.427
+ TotalRooms 1 4897 345258 88.811
+ FullBath 1 4220 345935 90.046
- factor(HouseStyle) 6 4599 354753 92.133
+ WoodDeckSF 1 2703 347451 92.812
<none> 350155 95.744
+ I(WoodDeckSF^2) 1 896 349258 96.109
+ GroundSF 1 697 349458 96.473
+ YearBuilt 1 373 349782 97.064
+ I(BasementSF^2) 1 4 350151 97.737
- I(LotArea^2) 1 10708 360862 113.277
- BasementSF 1 15192 365346 121.457
- LotArea 1 20183 370337 130.561
- I(YearBuilt^2) 1 39181 389336 165.217
- factor(BasementHt) 4 46395 396549 172.376
- factor(Condition) 7 61003 411157 193.024
- factor(ExteriorQ) 3 82504 432659 240.246
- I(GroundSF^2) 1 163484 513639 391.968
Step: AIC=80.69
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(HouseStyle) + factor(Condition) + I(YearBuilt^2) +
BasementSF + LotArea + I(LotArea^2) + factor(KitchenQ)
Df Sum of Sq RSS Cp
+ I(GarageCars^2) 1 7655 330957 68.724
+ factor(BasementFin) 5 11288 327324 70.097
+ GarageCars 1 5848 332764 72.021
+ TotalRooms 1 4510 334102 74.462
+ FullBath 1 4214 334399 75.002
- factor(HouseStyle) 6 4119 342731 76.203
+ WoodDeckSF 1 2159 336454 78.751
<none> 338612 80.689
+ GroundSF 1 833 337779 81.169
+ I(WoodDeckSF^2) 1 823 337789 81.188
+ YearBuilt 1 107 338505 82.494
+ I(BasementSF^2) 1 4 338609 82.682
- factor(KitchenQ) 3 11542 350155 95.744
- I(LotArea^2) 1 10187 348800 97.272
- BasementSF 1 15072 353684 106.183
- LotArea 1 19681 358293 114.591
- I(YearBuilt^2) 1 35073 373685 142.668
- factor(BasementHt) 4 39626 378239 144.974
- factor(ExteriorQ) 3 41745 380357 150.838
- factor(Condition) 7 51298 389911 160.266
- I(GroundSF^2) 1 155237 493849 361.868
Step: AIC=68.72
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(HouseStyle) + factor(Condition) + I(YearBuilt^2) +
BasementSF + LotArea + I(LotArea^2) + factor(KitchenQ) +
I(GarageCars^2)
Df Sum of Sq RSS Cp
+ factor(BasementFin) 5 13552 317405 54.002
+ TotalRooms 1 5177 325780 61.281
+ FullBath 1 4755 326202 62.051
- factor(HouseStyle) 6 4240 335197 64.459
+ WoodDeckSF 1 1838 329119 67.371
<none> 330957 68.724
+ GroundSF 1 919 330038 69.048
+ I(WoodDeckSF^2) 1 730 330227 69.392
+ GarageCars 1 139 330818 70.470
+ I(BasementSF^2) 1 42 330915 70.647
+ YearBuilt 1 7 330950 70.711
- I(GarageCars^2) 1 7655 338612 80.689
- factor(KitchenQ) 3 11451 342408 83.612
- I(LotArea^2) 1 10064 341021 85.082
- BasementSF 1 11355 342312 87.438
- LotArea 1 18895 349853 101.193
- I(YearBuilt^2) 1 31854 362812 124.833
- factor(ExteriorQ) 3 36151 367108 128.670
- factor(BasementHt) 4 38327 369284 130.640
- factor(Condition) 7 51993 382950 149.569
- I(GroundSF^2) 1 136711 467669 316.110
Step: AIC=54
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(HouseStyle) + factor(Condition) + I(YearBuilt^2) +
BasementSF + LotArea + I(LotArea^2) + factor(KitchenQ) +
I(GarageCars^2) + factor(BasementFin)
Df Sum of Sq RSS Cp
- factor(HouseStyle) 6 4662 322066 50.506
+ FullBath 1 3012 314393 50.508
+ TotalRooms 1 2505 314899 51.432
+ GroundSF 1 1837 315568 52.652
+ WoodDeckSF 1 1111 316293 53.975
<none> 317405 54.002
+ YearBuilt 1 486 316919 55.116
+ GarageCars 1 443 316962 55.194
+ I(WoodDeckSF^2) 1 297 317108 55.461
+ I(BasementSF^2) 1 111 317293 55.799
- factor(KitchenQ) 3 10333 327738 66.853
- factor(BasementFin) 5 13552 330957 68.724
- BasementSF 1 9243 326648 68.863
- I(GarageCars^2) 1 9919 327324 70.097
- I(LotArea^2) 1 11150 328555 72.343
- LotArea 1 19862 337266 88.234
- I(YearBuilt^2) 1 26724 344129 100.752
- factor(BasementHt) 3 34121 351526 110.245
- factor(ExteriorQ) 3 35972 353377 113.622
- factor(Condition) 7 44738 362142 121.612
- I(GroundSF^2) 1 139309 456714 306.127
Step: AIC=50.51
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(Condition) + I(YearBuilt^2) + BasementSF + LotArea +
I(LotArea^2) + factor(KitchenQ) + I(GarageCars^2) + factor(BasementFin)
Df Sum of Sq RSS Cp
+ TotalRooms 1 2875 319191 47.261
+ FullBath 1 2819 319247 47.363
+ WoodDeckSF 1 1114 320952 50.473
<none> 322066 50.506
+ GroundSF 1 689 321377 51.249
+ YearBuilt 1 391 321675 51.792
+ I(WoodDeckSF^2) 1 315 321751 51.931
+ GarageCars 1 259 321807 52.034
+ I(BasementSF^2) 1 59 322007 52.398
+ factor(HouseStyle) 6 4662 317405 54.002
- factor(KitchenQ) 3 10432 332498 63.535
- factor(BasementFin) 5 13131 335197 64.459
- I(GarageCars^2) 1 9543 331610 65.915
- I(LotArea^2) 1 10966 333032 68.510
- LotArea 1 20710 342777 86.286
- BasementSF 1 25906 347972 95.762
- factor(BasementHt) 3 33072 355138 104.835
- I(YearBuilt^2) 1 32353 354419 107.523
- factor(ExteriorQ) 3 36352 358418 110.819
- factor(Condition) 7 46123 368189 120.642
- I(GroundSF^2) 1 245999 568065 497.251
Step: AIC=47.26
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(Condition) + I(YearBuilt^2) + BasementSF + LotArea +
I(LotArea^2) + factor(KitchenQ) + I(GarageCars^2) + factor(BasementFin) +
TotalRooms
Df Sum of Sq RSS Cp
+ GroundSF 1 2642 316549 44.441
+ FullBath 1 1700 317491 46.160
<none> 319191 47.261
+ WoodDeckSF 1 1055 318136 47.336
+ YearBuilt 1 701 318490 47.982
+ GarageCars 1 254 318937 48.797
+ I(WoodDeckSF^2) 1 253 318938 48.799
+ I(BasementSF^2) 1 61 319130 49.149
- TotalRooms 1 2875 322066 50.506
+ factor(HouseStyle) 6 4292 314899 51.432
- factor(BasementFin) 5 10407 329598 56.246
- factor(KitchenQ) 3 10086 329277 59.660
- I(GarageCars^2) 1 9807 328998 63.150
- I(LotArea^2) 1 12793 331984 68.598
- LotArea 1 23031 342222 87.274
- BasementSF 1 24625 343816 90.181
- factor(BasementHt) 3 33636 352827 102.619
- I(YearBuilt^2) 1 31999 351190 103.632
- factor(ExteriorQ) 3 36175 355366 107.250
- factor(Condition) 7 44509 363700 114.453
- I(GroundSF^2) 1 146108 465299 311.788
Step: AIC=44.44
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(Condition) + I(YearBuilt^2) + BasementSF + LotArea +
I(LotArea^2) + factor(KitchenQ) + I(GarageCars^2) + factor(BasementFin) +
TotalRooms + GroundSF
Df Sum of Sq RSS Cp
+ FullBath 1 3126 313423 40.738
+ WoodDeckSF 1 1131 315418 44.377
<none> 316549 44.441
+ GarageCars 1 810 315739 44.963
+ YearBuilt 1 561 315987 45.417
+ I(WoodDeckSF^2) 1 310 316239 45.875
+ factor(HouseStyle) 6 5761 310787 45.931
+ I(BasementSF^2) 1 28 316521 46.390
- GroundSF 1 2642 319191 47.261
- TotalRooms 1 4829 321377 51.249
- factor(BasementFin) 5 10764 327313 54.077
- factor(KitchenQ) 3 10165 326714 56.983
- I(GroundSF^2) 1 8198 324747 57.396
- I(GarageCars^2) 1 9974 326523 60.635
- I(LotArea^2) 1 13354 329903 66.801
- BasementSF 1 21321 337870 81.334
- LotArea 1 23190 339738 84.743
- factor(BasementHt) 3 34226 350774 100.875
- I(YearBuilt^2) 1 32343 348892 101.440
- factor(ExteriorQ) 3 34851 351400 102.015
- factor(Condition) 7 44476 361025 111.573
Step: AIC=40.74
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(Condition) + I(YearBuilt^2) + BasementSF + LotArea +
I(LotArea^2) + factor(KitchenQ) + I(GarageCars^2) + factor(BasementFin) +
TotalRooms + GroundSF + FullBath
Df Sum of Sq RSS Cp
+ factor(HouseStyle) 6 6596 306826 40.705
<none> 313423 40.738
+ WoodDeckSF 1 1043 312379 40.835
+ YearBuilt 1 964 312458 40.979
+ GarageCars 1 830 312593 41.224
+ I(WoodDeckSF^2) 1 257 313165 42.269
+ I(BasementSF^2) 1 26 313397 42.691
- FullBath 1 3126 316549 44.441
- TotalRooms 1 3743 317165 45.565
- GroundSF 1 4069 317491 46.160
- factor(BasementFin) 5 9978 323401 48.940
- I(GroundSF^2) 1 6256 319678 50.150
- factor(KitchenQ) 3 10200 323622 53.344
- I(GarageCars^2) 1 10322 323744 57.567
- I(LotArea^2) 1 13377 326800 63.140
- BasementSF 1 22371 335793 79.546
- LotArea 1 22885 336307 80.484
- factor(ExteriorQ) 3 33897 347319 96.572
- factor(BasementHt) 3 35224 348646 98.993
- I(YearBuilt^2) 1 34895 348317 102.392
- factor(Condition) 7 44972 358394 108.775
Step: AIC=40.7
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(Condition) + I(YearBuilt^2) + BasementSF + LotArea +
I(LotArea^2) + factor(KitchenQ) + I(GarageCars^2) + factor(BasementFin) +
TotalRooms + GroundSF + FullBath + factor(HouseStyle)
Df Sum of Sq RSS Cp
+ GarageCars 1 1347 305479 40.248
+ YearBuilt 1 1184 305642 40.544
+ WoodDeckSF 1 1115 305711 40.672
<none> 306826 40.705
- factor(HouseStyle) 6 6596 313423 40.738
+ I(WoodDeckSF^2) 1 282 306544 42.191
+ I(BasementSF^2) 1 61 306765 42.593
- TotalRooms 1 3566 310392 45.209
- FullBath 1 3961 310787 45.931
- I(GroundSF^2) 1 4167 310993 46.306
- BasementSF 1 4489 311315 46.893
- GroundSF 1 6096 312922 49.825
- factor(BasementFin) 5 10552 317378 49.954
- factor(KitchenQ) 3 10141 316967 53.204
- I(GarageCars^2) 1 10914 317741 58.615
- I(LotArea^2) 1 13555 320381 63.431
- LotArea 1 21061 327887 77.124
- I(YearBuilt^2) 1 28577 335403 90.834
- factor(ExteriorQ) 3 33591 340417 95.981
- factor(BasementHt) 3 36372 343198 101.053
- factor(Condition) 7 42818 349644 104.813
Step: AIC=40.25
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(Condition) + I(YearBuilt^2) + BasementSF + LotArea +
I(LotArea^2) + factor(KitchenQ) + I(GarageCars^2) + factor(BasementFin) +
TotalRooms + GroundSF + FullBath + factor(HouseStyle) + GarageCars
Df Sum of Sq RSS Cp
+ WoodDeckSF 1 1114 304366 40.217
<none> 305479 40.248
+ YearBuilt 1 858 304622 40.684
- GarageCars 1 1347 306826 40.705
- factor(HouseStyle) 6 7113 312593 41.224
+ I(WoodDeckSF^2) 1 291 305189 41.718
+ I(BasementSF^2) 1 49 305431 42.159
- I(GroundSF^2) 1 3069 308548 43.846
- BasementSF 1 3838 309318 45.250
- TotalRooms 1 3924 309404 45.407
- FullBath 1 4020 309500 45.582
- I(GarageCars^2) 1 5331 310810 47.973
- factor(BasementFin) 5 11036 316516 50.381
- GroundSF 1 7039 312518 51.088
- factor(KitchenQ) 3 9791 315271 52.109
- I(LotArea^2) 1 13376 318855 62.648
- LotArea 1 21055 326535 76.657
- I(YearBuilt^2) 1 29912 335391 92.813
- factor(ExteriorQ) 3 33112 338592 94.651
- factor(BasementHt) 3 35133 340612 98.336
- factor(Condition) 7 43312 348791 105.257
Step: AIC=40.22
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(Condition) + I(YearBuilt^2) + BasementSF + LotArea +
I(LotArea^2) + factor(KitchenQ) + I(GarageCars^2) + factor(BasementFin) +
TotalRooms + GroundSF + FullBath + factor(HouseStyle) + GarageCars +
WoodDeckSF
Df Sum of Sq RSS Cp
<none> 304366 40.217
- WoodDeckSF 1 1114 305479 40.248
+ I(WoodDeckSF^2) 1 935 303431 40.511
+ YearBuilt 1 887 303479 40.599
- GarageCars 1 1346 305711 40.672
- factor(HouseStyle) 6 7179 311544 41.312
+ I(BasementSF^2) 1 17 304349 42.186
- I(GroundSF^2) 1 2870 307235 43.452
- BasementSF 1 3580 307946 44.747
- TotalRooms 1 3910 308275 45.349
- FullBath 1 3929 308295 45.384
- I(GarageCars^2) 1 5246 309611 47.786
- factor(BasementFin) 5 10450 314816 49.280
- GroundSF 1 7200 311566 51.351
- factor(KitchenQ) 3 9474 313840 51.499
- I(LotArea^2) 1 13021 317387 61.970
- LotArea 1 20431 324797 75.487
- I(YearBuilt^2) 1 30239 334604 93.377
- factor(ExteriorQ) 3 33011 337376 94.434
- factor(BasementHt) 3 34043 338408 96.317
- factor(Condition) 7 41713 346079 102.309
Call:
lm(formula = Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(Condition) + I(YearBuilt^2) + BasementSF + LotArea +
I(LotArea^2) + factor(KitchenQ) + I(GarageCars^2) + factor(BasementFin) +
TotalRooms + GroundSF + FullBath + factor(HouseStyle) + GarageCars +
WoodDeckSF, data = AmesTrain6a)
Coefficients:
(Intercept) factor(ExteriorQ)Fa factor(ExteriorQ)Gd factor(ExteriorQ)TA
-3.578e+02 -7.652e+01 -4.455e+01 -6.249e+01
I(GroundSF^2) factor(BasementHt)Fa factor(BasementHt)Gd factor(BasementHt)None
8.318e-06 -4.839e+01 -3.652e+01 -5.351e+01
factor(BasementHt)TA factor(Condition)3 factor(Condition)4 factor(Condition)5
-4.732e+01 5.008e+00 1.510e+01 2.705e+01
factor(Condition)6 factor(Condition)7 factor(Condition)8 factor(Condition)9
3.544e+01 4.263e+01 5.135e+01 6.339e+01
I(YearBuilt^2) BasementSF LotArea I(LotArea^2)
1.281e-04 1.360e-02 4.725e-03 -1.464e-07
factor(KitchenQ)Fa factor(KitchenQ)Gd factor(KitchenQ)TA I(GarageCars^2)
-3.115e+01 -2.127e+01 -2.217e+01 4.559e+00
factor(BasementFin)BLQ factor(BasementFin)GLQ factor(BasementFin)LwQ factor(BasementFin)None
7.728e-01 3.401e+00 -9.447e+00 NA
factor(BasementFin)Rec factor(BasementFin)Unf TotalRooms GroundSF
-9.769e-01 -7.594e+00 -3.331e+00 5.274e-02
FullBath factor(HouseStyle)1.5Unf factor(HouseStyle)1Story factor(HouseStyle)2.5Unf
-7.362e+00 2.897e+01 1.281e+01 1.156e+01
factor(HouseStyle)2Story factor(HouseStyle)SFoyer factor(HouseStyle)SLvl GarageCars
4.640e+00 1.714e+00 8.855e+00 -7.686e+00
WoodDeckSF
1.299e-02
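The NA reported for factor(BasementFin)None above means that dummy column is an exact linear combination of columns already in the model. One likely cause, which is an assumption here and not something the output confirms, is that houses without a basement are coded None in both BasementHt and BasementFin, so the two None indicators are identical. A minimal sketch of that situation on a made-up data frame (the names toy, ht, fin and y are hypothetical):

```r
# Toy data (hypothetical): rows 3 and 6 are "None" in BOTH factors,
# so the two "None" dummy columns are identical.
toy <- data.frame(
  ht  = factor(c("Gd", "TA", "None", "Gd", "TA", "None", "Gd", "TA")),
  fin = factor(c("GLQ", "Unf", "None", "Unf", "GLQ", "None", "GLQ", "Unf")),
  y   = c(10, 12, 5, 11, 13, 6, 9, 14)
)

fit <- lm(y ~ ht + fin, data = toy)

# lm() keeps the first of the duplicated columns and reports NA for the later one.
aliased <- names(coef(fit))[is.na(coef(fit))]
aliased

# alias() shows which earlier columns reproduce the dropped one.
alias(fit)$Complete
```

Because the later of the two duplicated columns is the one dropped, which term shows NA depends on the order of terms in the formula.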
Forward selection
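# Residual variance of the full transformed model, used as the scale for Mallows' Cp
MSE = (summary(modTransformFull)$sigma)^2
# Intercept-only model as the starting point for forward selection
none = lm(Price ~ 1, data = AmesTrain6a)
step(none, scope = list(upper = modTransformFull), scale = MSE, direction = "forward")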
Start: AIC=4685.16
Price ~ 1
Df Sum of Sq RSS Cp
+ factor(ExteriorQ) 3 1606904 1286545 1759.9
+ factor(BasementHt) 4 1555913 1337536 1854.9
+ GroundSF 1 1427972 1465476 2082.3
+ I(GroundSF^2) 1 1394445 1499004 2143.4
+ factor(KitchenQ) 3 1305178 1588271 2310.3
+ I(GarageCars^2) 1 1276249 1617200 2359.1
+ GarageCars 1 1118553 1774895 2646.7
+ I(BasementSF^2) 1 1109930 1783519 2662.5
+ BasementSF 1 1010777 1882672 2843.3
+ I(YearBuilt^2) 1 1005760 1887689 2852.5
+ YearBuilt 1 996521 1896928 2869.3
+ FullBath 1 910528 1982921 3026.2
+ TotalRooms 1 613129 2280320 3568.7
+ factor(Condition) 7 546078 2347371 3703.0
+ factor(BasementFin) 6 515220 2378228 3757.3
+ WoodDeckSF 1 350672 2542776 4047.5
+ LotArea 1 291166 2602283 4156.0
+ factor(HouseStyle) 6 230783 2662666 4276.2
+ I(WoodDeckSF^2) 1 185418 2708031 4348.9
+ I(LotArea^2) 1 169321 2724127 4378.3
<none> 2893449 4685.2
Step: AIC=1759.89
Price ~ factor(ExteriorQ)
Df Sum of Sq RSS Cp
+ I(GroundSF^2) 1 559835 726710 740.65
+ GroundSF 1 551843 734702 755.23
+ I(GarageCars^2) 1 281632 1004913 1248.14
+ I(BasementSF^2) 1 265658 1020887 1277.28
+ BasementSF 1 253821 1032724 1298.87
+ TotalRooms 1 250074 1036472 1305.71
+ factor(BasementHt) 4 250784 1035761 1310.41
+ GarageCars 1 238478 1048067 1326.86
+ FullBath 1 202387 1084158 1392.70
+ LotArea 1 169637 1116908 1452.44
+ I(YearBuilt^2) 1 148483 1138062 1491.03
+ YearBuilt 1 147695 1138850 1492.46
+ factor(KitchenQ) 3 111799 1174746 1561.94
+ I(LotArea^2) 1 100624 1185921 1578.33
+ WoodDeckSF 1 86278 1200267 1604.50
+ I(WoodDeckSF^2) 1 71648 1214897 1631.19
+ factor(Condition) 7 56831 1229715 1670.22
+ factor(HouseStyle) 6 51999 1234546 1677.03
+ factor(BasementFin) 6 50313 1236232 1680.11
<none> 1286545 1759.89
Step: AIC=740.65
Price ~ factor(ExteriorQ) + I(GroundSF^2)
Df Sum of Sq RSS Cp
+ factor(BasementHt) 4 160720 565990 455.47
+ I(YearBuilt^2) 1 157397 569313 455.53
+ YearBuilt 1 157382 569329 455.56
+ BasementSF 1 121924 604786 520.24
+ I(BasementSF^2) 1 112159 614551 538.05
+ factor(HouseStyle) 6 113570 613140 545.48
+ factor(BasementFin) 6 92363 634347 584.16
+ I(GarageCars^2) 1 66215 660495 621.86
+ GarageCars 1 63555 663155 626.71
+ factor(KitchenQ) 3 56948 669762 642.77
+ factor(Condition) 7 53872 672838 656.38
+ WoodDeckSF 1 33869 692841 680.86
+ LotArea 1 30692 696018 686.66
+ I(WoodDeckSF^2) 1 16874 709836 711.87
+ TotalRooms 1 13037 713673 718.87
+ I(LotArea^2) 1 11312 715398 722.01
+ FullBath 1 6776 719934 730.29
+ GroundSF 1 3391 723319 736.46
<none> 726710 740.65
Step: AIC=455.47
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt)
Df Sum of Sq RSS Cp
+ factor(HouseStyle) 6 84078 481912 314.09
+ BasementSF 1 74839 491150 320.95
+ YearBuilt 1 66808 499182 335.60
+ I(YearBuilt^2) 1 66606 499384 335.96
+ I(BasementSF^2) 1 65947 500043 337.17
+ GarageCars 1 37702 528288 388.69
+ I(GarageCars^2) 1 37418 528572 389.21
+ factor(BasementFin) 5 39757 526233 392.94
+ LotArea 1 34700 531290 394.17
+ factor(Condition) 7 37467 528522 401.12
+ factor(KitchenQ) 3 29273 536717 408.07
+ WoodDeckSF 1 14324 551666 431.34
+ I(LotArea^2) 1 13501 552489 432.84
+ I(WoodDeckSF^2) 1 10439 555551 438.42
+ TotalRooms 1 5224 560766 447.94
+ GroundSF 1 3983 562007 450.20
<none> 565990 455.47
+ FullBath 1 200 565790 457.10
Step: AIC=314.09
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(HouseStyle)
Df Sum of Sq RSS Cp
+ factor(Condition) 7 38427 443485 257.99
+ YearBuilt 1 30128 451783 261.13
+ I(YearBuilt^2) 1 30095 451816 261.19
+ factor(BasementFin) 5 26815 455097 275.18
+ factor(KitchenQ) 3 24506 457406 275.39
+ GarageCars 1 20826 461085 278.10
+ I(GarageCars^2) 1 20337 461574 278.99
+ BasementSF 1 19221 462690 281.03
+ I(BasementSF^2) 1 13778 468134 290.96
+ GroundSF 1 12741 469171 292.85
+ LotArea 1 12573 469338 293.16
+ WoodDeckSF 1 8474 473438 300.63
+ I(WoodDeckSF^2) 1 4728 477184 307.47
+ I(LotArea^2) 1 2753 479159 311.07
<none> 481912 314.09
+ TotalRooms 1 1025 480887 314.22
+ FullBath 1 125 481787 315.86
Step: AIC=257.99
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(HouseStyle) + factor(Condition)
Df Sum of Sq RSS Cp
+ I(YearBuilt^2) 1 47529 395956 173.29
+ YearBuilt 1 47420 396065 173.49
+ BasementSF 1 25980 417505 212.60
+ I(GarageCars^2) 1 21214 422271 221.30
+ GarageCars 1 21029 422456 221.63
+ I(BasementSF^2) 1 19924 423561 223.65
+ factor(BasementFin) 5 20154 423331 231.23
+ factor(KitchenQ) 3 16659 426826 233.61
+ LotArea 1 14369 429116 233.78
+ GroundSF 1 13497 429988 235.37
+ WoodDeckSF 1 5209 438276 250.49
+ I(LotArea^2) 1 3787 439697 253.09
+ I(WoodDeckSF^2) 1 2155 441329 256.06
<none> 443485 257.99
+ TotalRooms 1 437 443048 259.20
+ FullBath 1 97 443388 259.82
Step: AIC=173.29
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(HouseStyle) + factor(Condition) + I(YearBuilt^2)
Df Sum of Sq RSS Cp
+ BasementSF 1 19141.4 376814 140.38
+ LotArea 1 17162.5 378793 143.99
+ I(BasementSF^2) 1 15212.6 380743 147.54
+ I(GarageCars^2) 1 14269.5 381686 149.26
+ GarageCars 1 11136.3 384819 154.98
+ GroundSF 1 10966.0 384990 155.29
+ factor(KitchenQ) 3 11718.3 384237 157.92
+ factor(BasementFin) 5 13148.7 382807 159.31
+ I(LotArea^2) 1 6413.0 389543 163.59
+ WoodDeckSF 1 5133.3 390822 165.93
+ I(WoodDeckSF^2) 1 2603.8 393352 170.54
+ FullBath 1 1859.1 394097 171.90
<none> 395956 173.29
+ TotalRooms 1 648.3 395307 174.11
+ YearBuilt 1 402.5 395553 174.56
Step: AIC=140.38
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(HouseStyle) + factor(Condition) + I(YearBuilt^2) +
BasementSF
Df Sum of Sq RSS Cp
+ LotArea 1 15952.0 360862 113.28
+ I(GarageCars^2) 1 9569.7 367245 124.92
+ factor(KitchenQ) 3 11622.0 365192 125.18
+ GarageCars 1 7546.2 369268 128.61
+ factor(BasementFin) 5 11477.0 365337 129.44
+ I(LotArea^2) 1 6477.0 370337 130.56
+ WoodDeckSF 1 3869.5 372945 135.32
+ FullBath 1 3450.1 373364 136.08
+ GroundSF 1 3147.1 373667 136.63
+ I(WoodDeckSF^2) 1 1520.3 375294 139.60
+ TotalRooms 1 1403.4 375411 139.82
<none> 376814 140.38
+ I(BasementSF^2) 1 578.7 376236 141.32
+ YearBuilt 1 324.7 376490 141.78
Step: AIC=113.28
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(HouseStyle) + factor(Condition) + I(YearBuilt^2) +
BasementSF + LotArea
Df Sum of Sq RSS Cp
+ I(LotArea^2) 1 10707.5 350155 95.744
+ factor(KitchenQ) 3 12062.7 348800 97.272
+ I(GarageCars^2) 1 7903.0 352959 100.860
+ factor(BasementFin) 5 11062.9 349799 103.096
+ GarageCars 1 5588.2 355274 105.083
+ FullBath 1 3607.4 357255 108.696
+ WoodDeckSF 1 3145.9 357716 109.538
+ TotalRooms 1 2410.8 358452 110.879
<none> 360862 113.277
+ GroundSF 1 902.6 359960 113.630
+ I(WoodDeckSF^2) 1 870.8 359991 113.688
+ YearBuilt 1 592.9 360269 114.195
+ I(BasementSF^2) 1 353.9 360508 114.631
Step: AIC=95.74
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(HouseStyle) + factor(Condition) + I(YearBuilt^2) +
BasementSF + LotArea + I(LotArea^2)
Df Sum of Sq RSS Cp
+ factor(KitchenQ) 3 11542.5 338612 80.689
+ factor(BasementFin) 5 12613.1 337542 82.736
+ I(GarageCars^2) 1 7747.1 342408 83.612
+ GarageCars 1 5656.0 344499 87.427
+ TotalRooms 1 4897.2 345258 88.811
+ FullBath 1 4220.3 345935 90.046
+ WoodDeckSF 1 2703.5 347451 92.812
<none> 350155 95.744
+ I(WoodDeckSF^2) 1 896.4 349258 96.109
+ GroundSF 1 696.8 349458 96.473
+ YearBuilt 1 373.0 349782 97.064
+ I(BasementSF^2) 1 3.7 350151 97.737
Step: AIC=80.69
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(HouseStyle) + factor(Condition) + I(YearBuilt^2) +
BasementSF + LotArea + I(LotArea^2) + factor(KitchenQ)
Df Sum of Sq RSS Cp
+ I(GarageCars^2) 1 7655.1 330957 68.724
+ factor(BasementFin) 5 11288.4 327324 70.097
+ GarageCars 1 5847.9 332764 72.021
+ TotalRooms 1 4509.9 334102 74.462
+ FullBath 1 4213.5 334399 75.002
+ WoodDeckSF 1 2158.8 336454 78.751
<none> 338612 80.689
+ GroundSF 1 832.9 337779 81.169
+ I(WoodDeckSF^2) 1 822.9 337789 81.188
+ YearBuilt 1 106.8 338505 82.494
+ I(BasementSF^2) 1 3.5 338609 82.682
Step: AIC=68.72
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(HouseStyle) + factor(Condition) + I(YearBuilt^2) +
BasementSF + LotArea + I(LotArea^2) + factor(KitchenQ) +
I(GarageCars^2)
Df Sum of Sq RSS Cp
+ factor(BasementFin) 5 13552.3 317405 54.002
+ TotalRooms 1 5177.0 325780 61.281
+ FullBath 1 4754.9 326202 62.051
+ WoodDeckSF 1 1838.1 329119 67.371
<none> 330957 68.724
+ GroundSF 1 918.9 330038 69.048
+ I(WoodDeckSF^2) 1 730.4 330227 69.392
+ GarageCars 1 139.3 330818 70.470
+ I(BasementSF^2) 1 42.2 330915 70.647
+ YearBuilt 1 7.4 330950 70.711
Step: AIC=54
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(HouseStyle) + factor(Condition) + I(YearBuilt^2) +
BasementSF + LotArea + I(LotArea^2) + factor(KitchenQ) +
I(GarageCars^2) + factor(BasementFin)
Df Sum of Sq RSS Cp
+ FullBath 1 3012.04 314393 50.508
+ TotalRooms 1 2505.48 314899 51.432
+ GroundSF 1 1836.90 315568 52.652
+ WoodDeckSF 1 1111.30 316293 53.975
<none> 317405 54.002
+ YearBuilt 1 485.74 316919 55.116
+ GarageCars 1 442.96 316962 55.194
+ I(WoodDeckSF^2) 1 296.71 317108 55.461
+ I(BasementSF^2) 1 111.45 317293 55.799
Step: AIC=50.51
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(HouseStyle) + factor(Condition) + I(YearBuilt^2) +
BasementSF + LotArea + I(LotArea^2) + factor(KitchenQ) +
I(GarageCars^2) + factor(BasementFin) + FullBath
Df Sum of Sq RSS Cp
+ GroundSF 1 4001.2 310392 45.209
+ TotalRooms 1 1470.8 312922 49.825
<none> 314393 50.508
+ WoodDeckSF 1 991.6 313401 50.699
+ YearBuilt 1 939.9 313453 50.793
+ GarageCars 1 365.1 314028 51.842
+ I(WoodDeckSF^2) 1 220.8 314172 52.105
+ I(BasementSF^2) 1 126.7 314266 52.277
Step: AIC=45.21
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(HouseStyle) + factor(Condition) + I(YearBuilt^2) +
BasementSF + LotArea + I(LotArea^2) + factor(KitchenQ) +
I(GarageCars^2) + factor(BasementFin) + FullBath + GroundSF
Df Sum of Sq RSS Cp
+ TotalRooms 1 3565.6 306826 40.705
+ WoodDeckSF 1 1128.4 309263 45.151
<none> 310392 45.209
+ GarageCars 1 987.8 309404 45.407
+ YearBuilt 1 906.8 309485 45.555
+ I(WoodDeckSF^2) 1 311.9 310080 46.640
+ I(BasementSF^2) 1 78.1 310313 47.067
Step: AIC=40.7
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(HouseStyle) + factor(Condition) + I(YearBuilt^2) +
BasementSF + LotArea + I(LotArea^2) + factor(KitchenQ) +
I(GarageCars^2) + factor(BasementFin) + FullBath + GroundSF +
TotalRooms
Df Sum of Sq RSS Cp
+ GarageCars 1 1346.69 305479 40.248
+ YearBuilt 1 1184.41 305642 40.544
+ WoodDeckSF 1 1114.54 305711 40.672
<none> 306826 40.705
+ I(WoodDeckSF^2) 1 281.70 306544 42.191
+ I(BasementSF^2) 1 61.43 306765 42.593
Step: AIC=40.25
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(HouseStyle) + factor(Condition) + I(YearBuilt^2) +
BasementSF + LotArea + I(LotArea^2) + factor(KitchenQ) +
I(GarageCars^2) + factor(BasementFin) + FullBath + GroundSF +
TotalRooms + GarageCars
Df Sum of Sq RSS Cp
+ WoodDeckSF 1 1113.64 304366 40.217
<none> 305479 40.248
+ YearBuilt 1 857.57 304622 40.684
+ I(WoodDeckSF^2) 1 290.63 305189 41.718
+ I(BasementSF^2) 1 48.76 305431 42.159
Step: AIC=40.22
Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(HouseStyle) + factor(Condition) + I(YearBuilt^2) +
BasementSF + LotArea + I(LotArea^2) + factor(KitchenQ) +
I(GarageCars^2) + factor(BasementFin) + FullBath + GroundSF +
TotalRooms + GarageCars + WoodDeckSF
Df Sum of Sq RSS Cp
<none> 304366 40.217
+ I(WoodDeckSF^2) 1 935.01 303431 40.511
+ YearBuilt 1 887.06 303479 40.599
+ I(BasementSF^2) 1 17.03 304349 42.186
Call:
lm(formula = Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(HouseStyle) + factor(Condition) + I(YearBuilt^2) +
BasementSF + LotArea + I(LotArea^2) + factor(KitchenQ) +
I(GarageCars^2) + factor(BasementFin) + FullBath + GroundSF +
TotalRooms + GarageCars + WoodDeckSF, data = AmesTrain6a)
Coefficients:
(Intercept) factor(ExteriorQ)Fa factor(ExteriorQ)Gd factor(ExteriorQ)TA
-3.578e+02 -7.652e+01 -4.455e+01 -6.249e+01
I(GroundSF^2) factor(BasementHt)Fa factor(BasementHt)Gd factor(BasementHt)None
8.318e-06 -4.839e+01 -3.652e+01 -5.351e+01
factor(BasementHt)TA factor(HouseStyle)1.5Unf factor(HouseStyle)1Story factor(HouseStyle)2.5Unf
-4.732e+01 2.897e+01 1.281e+01 1.156e+01
factor(HouseStyle)2Story factor(HouseStyle)SFoyer factor(HouseStyle)SLvl factor(Condition)3
4.640e+00 1.714e+00 8.855e+00 5.008e+00
factor(Condition)4 factor(Condition)5 factor(Condition)6 factor(Condition)7
1.510e+01 2.705e+01 3.544e+01 4.263e+01
factor(Condition)8 factor(Condition)9 I(YearBuilt^2) BasementSF
5.135e+01 6.339e+01 1.281e-04 1.360e-02
LotArea I(LotArea^2) factor(KitchenQ)Fa factor(KitchenQ)Gd
4.725e-03 -1.464e-07 -3.115e+01 -2.127e+01
factor(KitchenQ)TA I(GarageCars^2) factor(BasementFin)BLQ factor(BasementFin)GLQ
-2.217e+01 4.559e+00 7.728e-01 3.401e+00
factor(BasementFin)LwQ factor(BasementFin)None factor(BasementFin)Rec factor(BasementFin)Unf
-9.447e+00 NA -9.769e-01 -7.594e+00
FullBath GroundSF TotalRooms GarageCars
-7.362e+00 5.274e-02 -3.331e+00 -7.686e+00
WoodDeckSF
1.299e-02
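The trace above prints "AIC=" on each Step line while the last column is headed "Cp". That is what step() does when it is given a positive scale: the criterion it minimizes becomes Mallows' Cp, RSS/scale - n + 2p, where p is the number of estimated coefficients. The sketch below checks that identity on R's built-in mtcars data, which only stands in for the Ames data; the formulas and object names are illustrative assumptions.

```r
# Stand-in "full" model and a smaller candidate model (illustrative only).
full <- lm(mpg ~ wt + hp + qsec, data = mtcars)
sub  <- lm(mpg ~ wt + hp, data = mtcars)

# Scale = residual variance of the full model, as in the MSE line above.
mse <- summary(full)$sigma^2

n <- nrow(mtcars)
p <- n - df.residual(sub)          # number of estimated coefficients in sub

# Mallows' Cp by hand: RSS / scale - n + 2p
cp_manual <- deviance(sub) / mse - n + 2 * p

# What step() uses internally when scale > 0
cp_step <- extractAIC(sub, scale = mse)[2]
all.equal(cp_manual, cp_step)

# For the full model itself, RSS / mse = n - p, so Cp reduces to p.
p_full  <- n - df.residual(full)
cp_full <- extractAIC(full, scale = mse)[2]
all.equal(cp_full, p_full)
```

The second check is why a backward run that starts from the full model reports a starting criterion equal to that model's number of estimated coefficients.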
Backward selection
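# Reuse the full model's residual variance as the Cp scale
MSE = (summary(modTransformFull)$sigma)^2
# With no scope argument, step() defaults to backward elimination
step(modTransformFull, scale = MSE)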
Start: AIC=43
Price ~ LotArea + I(LotArea^2) + YearBuilt + I(YearBuilt^2) +
BasementSF + I(BasementSF^2) + GarageCars + I(GarageCars^2) +
WoodDeckSF + I(WoodDeckSF^2) + GroundSF + I(GroundSF^2) +
FullBath + TotalRooms + factor(HouseStyle) + factor(ExteriorQ) +
factor(BasementFin) + factor(KitchenQ) + factor(BasementHt) +
factor(Condition)
Df Sum of Sq RSS Cp
- I(BasementSF^2) 1 63 302665 41.115
- BasementSF 1 198 302800 41.361
- YearBuilt 1 759 303361 42.384
- I(WoodDeckSF^2) 1 861 303463 42.570
- I(YearBuilt^2) 1 872 303474 42.590
- GarageCars 1 990 303592 42.805
<none> 302602 43.000
- WoodDeckSF 1 1663 304265 44.033
- factor(HouseStyle) 6 7307 309909 44.329
- I(GroundSF^2) 1 3281 305884 46.986
- TotalRooms 1 4175 306777 48.616
- I(GarageCars^2) 1 4235 306837 48.725
- FullBath 1 4376 306979 48.983
- factor(KitchenQ) 3 8634 311236 52.750
- factor(BasementFin) 5 11177 313779 53.389
- GroundSF 1 6894 309496 53.576
- I(LotArea^2) 1 12250 314853 63.347
- LotArea 1 19813 322415 77.142
- factor(ExteriorQ) 3 27985 330587 88.049
- factor(BasementHt) 3 28051 330653 88.170
- factor(Condition) 7 43156 345758 107.724
Step: AIC=41.12
Price ~ LotArea + I(LotArea^2) + YearBuilt + I(YearBuilt^2) +
BasementSF + GarageCars + I(GarageCars^2) + WoodDeckSF +
I(WoodDeckSF^2) + GroundSF + I(GroundSF^2) + FullBath + TotalRooms +
factor(HouseStyle) + factor(ExteriorQ) + factor(BasementFin) +
factor(KitchenQ) + factor(BasementHt) + factor(Condition)
Df Sum of Sq RSS Cp
- YearBuilt 1 765 303431 40.511
- I(WoodDeckSF^2) 1 813 303479 40.599
- I(YearBuilt^2) 1 879 303544 40.718
- GarageCars 1 1002 303668 40.943
<none> 302665 41.115
- WoodDeckSF 1 1619 304285 42.069
- factor(HouseStyle) 6 7268 309934 42.374
- I(GroundSF^2) 1 3248 305914 45.041
- BasementSF 1 3892 306558 46.215
- TotalRooms 1 4192 306858 46.763
- I(GarageCars^2) 1 4276 306941 46.915
- FullBath 1 4372 307038 47.091
- factor(KitchenQ) 3 8646 311311 50.887
- factor(BasementFin) 5 11159 313824 51.471
- GroundSF 1 6982 309647 51.851
- I(LotArea^2) 1 12285 314951 61.526
- LotArea 1 20025 322690 75.644
- factor(ExteriorQ) 3 28333 330998 86.800
- factor(BasementHt) 3 29077 331742 88.157
- factor(Condition) 7 43098 345764 105.734
Step: AIC=40.51
Price ~ LotArea + I(LotArea^2) + I(YearBuilt^2) + BasementSF +
GarageCars + I(GarageCars^2) + WoodDeckSF + I(WoodDeckSF^2) +
GroundSF + I(GroundSF^2) + FullBath + TotalRooms + factor(HouseStyle) +
factor(ExteriorQ) + factor(BasementFin) + factor(KitchenQ) +
factor(BasementHt) + factor(Condition)
Df Sum of Sq RSS Cp
- I(WoodDeckSF^2) 1 935 304366 40.217
<none> 303431 40.511
- GarageCars 1 1306 304737 40.894
- factor(HouseStyle) 6 7176 310607 41.602
- WoodDeckSF 1 1758 305189 41.718
- I(GroundSF^2) 1 3073 306504 44.117
- BasementSF 1 3862 307293 45.556
- FullBath 1 3980 307411 45.772
- TotalRooms 1 4005 307436 45.817
- I(GarageCars^2) 1 5110 308541 47.833
- factor(BasementFin) 5 10521 313952 49.704
- factor(KitchenQ) 3 8931 312361 50.802
- GroundSF 1 7056 310487 51.383
- I(LotArea^2) 1 12375 315806 61.086
- LotArea 1 19968 323399 74.937
- factor(BasementHt) 3 31977 335407 92.842
- I(YearBuilt^2) 1 29816 333247 92.901
- factor(ExteriorQ) 3 32715 336146 94.190
- factor(Condition) 7 42338 345769 103.743
Step: AIC=40.22
Price ~ LotArea + I(LotArea^2) + I(YearBuilt^2) + BasementSF +
GarageCars + I(GarageCars^2) + WoodDeckSF + GroundSF + I(GroundSF^2) +
FullBath + TotalRooms + factor(HouseStyle) + factor(ExteriorQ) +
factor(BasementFin) + factor(KitchenQ) + factor(BasementHt) +
factor(Condition)
Df Sum of Sq RSS Cp
<none> 304366 40.217
- WoodDeckSF 1 1114 305479 40.248
- GarageCars 1 1346 305711 40.672
- factor(HouseStyle) 6 7179 311544 41.312
- I(GroundSF^2) 1 2870 307235 43.452
- BasementSF 1 3580 307946 44.747
- TotalRooms 1 3910 308275 45.349
- FullBath 1 3929 308295 45.384
- I(GarageCars^2) 1 5246 309611 47.786
- factor(BasementFin) 5 10450 314816 49.280
- GroundSF 1 7200 311566 51.351
- factor(KitchenQ) 3 9474 313840 51.499
- I(LotArea^2) 1 13021 317387 61.970
- LotArea 1 20431 324797 75.487
- I(YearBuilt^2) 1 30239 334604 93.377
- factor(ExteriorQ) 3 33011 337376 94.434
- factor(BasementHt) 3 34043 338408 96.317
- factor(Condition) 7 41713 346079 102.309
Call:
lm(formula = Price ~ LotArea + I(LotArea^2) + I(YearBuilt^2) +
BasementSF + GarageCars + I(GarageCars^2) + WoodDeckSF +
GroundSF + I(GroundSF^2) + FullBath + TotalRooms + factor(HouseStyle) +
factor(ExteriorQ) + factor(BasementFin) + factor(KitchenQ) +
factor(BasementHt) + factor(Condition), data = AmesTrain6a)
Coefficients:
(Intercept) LotArea I(LotArea^2) I(YearBuilt^2)
-3.578e+02 4.725e-03 -1.464e-07 1.281e-04
BasementSF GarageCars I(GarageCars^2) WoodDeckSF
1.360e-02 -7.686e+00 4.559e+00 1.299e-02
GroundSF I(GroundSF^2) FullBath TotalRooms
5.274e-02 8.318e-06 -7.362e+00 -3.331e+00
factor(HouseStyle)1.5Unf factor(HouseStyle)1Story factor(HouseStyle)2.5Unf factor(HouseStyle)2Story
2.897e+01 1.281e+01 1.156e+01 4.640e+00
factor(HouseStyle)SFoyer factor(HouseStyle)SLvl factor(ExteriorQ)Fa factor(ExteriorQ)Gd
1.714e+00 8.855e+00 -7.652e+01 -4.455e+01
factor(ExteriorQ)TA factor(BasementFin)BLQ factor(BasementFin)GLQ factor(BasementFin)LwQ
-6.249e+01 7.728e-01 3.401e+00 -9.447e+00
factor(BasementFin)None factor(BasementFin)Rec factor(BasementFin)Unf factor(KitchenQ)Fa
-5.351e+01 -9.769e-01 -7.594e+00 -3.115e+01
factor(KitchenQ)Gd factor(KitchenQ)TA factor(BasementHt)Fa factor(BasementHt)Gd
-2.127e+01 -2.217e+01 -4.839e+01 -3.652e+01
factor(BasementHt)None factor(BasementHt)TA factor(Condition)3 factor(Condition)4
NA -4.732e+01 5.008e+00 1.510e+01
factor(Condition)5 factor(Condition)6 factor(Condition)7 factor(Condition)8
2.705e+01 3.544e+01 4.263e+01 5.135e+01
factor(Condition)9
6.339e+01
Backward selection
BackwardMod = lm(Price ~ LotArea + I(LotArea^2) + I(YearBuilt^2) + BasementSF +
GarageCars + I(GarageCars^2) + WoodDeckSF + GroundSF + I(GroundSF^2) +
FullBath + TotalRooms + factor(HouseStyle) + factor(ExteriorQ) +
factor(BasementFin) + factor(KitchenQ) + factor(BasementHt) +
factor(Condition), data=AmesTrain6a)
summary(BackwardMod)
Call:
lm(formula = Price ~ LotArea + I(LotArea^2) + I(YearBuilt^2) +
BasementSF + GarageCars + I(GarageCars^2) + WoodDeckSF +
GroundSF + I(GroundSF^2) + FullBath + TotalRooms + factor(HouseStyle) +
factor(ExteriorQ) + factor(BasementFin) + factor(KitchenQ) +
factor(BasementHt) + factor(Condition), data = AmesTrain6a)
Residuals:
Min 1Q Median 3Q Max
-84.323 -11.695 -1.128 11.311 111.963
Coefficients: (1 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.578e+02 7.287e+01 -4.910 1.20e-06 ***
LotArea 4.725e-03 7.742e-04 6.104 1.95e-09 ***
I(LotArea^2) -1.464e-07 3.004e-08 -4.873 1.44e-06 ***
I(YearBuilt^2) 1.281e-04 1.726e-05 7.426 4.26e-13 ***
BasementSF 1.360e-02 5.324e-03 2.555 0.010885 *
GarageCars -7.686e+00 4.907e+00 -1.567 0.117796
I(GarageCars^2) 4.559e+00 1.474e+00 3.093 0.002083 **
WoodDeckSF 1.299e-02 9.113e-03 1.425 0.154713
GroundSF 5.274e-02 1.456e-02 3.623 0.000318 ***
I(GroundSF^2) 8.318e-06 3.636e-06 2.287 0.022543 *
FullBath -7.362e+00 2.750e+00 -2.677 0.007657 **
TotalRooms -3.331e+00 1.248e+00 -2.670 0.007805 **
factor(HouseStyle)1.5Unf 2.897e+01 1.160e+01 2.497 0.012826 *
factor(HouseStyle)1Story 1.281e+01 4.702e+00 2.725 0.006637 **
factor(HouseStyle)2.5Unf 1.156e+01 1.461e+01 0.791 0.429161
factor(HouseStyle)2Story 4.640e+00 4.357e+00 1.065 0.287368
factor(HouseStyle)SFoyer 1.714e+00 7.480e+00 0.229 0.818817
factor(HouseStyle)SLvl 8.855e+00 5.570e+00 1.590 0.112434
factor(ExteriorQ)Fa -7.652e+01 1.346e+01 -5.684 2.13e-08 ***
factor(ExteriorQ)Gd -4.455e+01 8.171e+00 -5.452 7.52e-08 ***
factor(ExteriorQ)TA -6.249e+01 8.774e+00 -7.122 3.30e-12 ***
factor(BasementFin)BLQ 7.728e-01 4.204e+00 0.184 0.854217
factor(BasementFin)GLQ 3.401e+00 3.363e+00 1.011 0.312315
factor(BasementFin)LwQ -9.447e+00 5.284e+00 -1.788 0.074359 .
factor(BasementFin)None -5.351e+01 1.088e+01 -4.918 1.15e-06 ***
factor(BasementFin)Rec -9.769e-01 4.501e+00 -0.217 0.828267
factor(BasementFin)Unf -7.594e+00 3.189e+00 -2.382 0.017571 *
factor(KitchenQ)Fa -3.115e+01 8.731e+00 -3.568 0.000391 ***
factor(KitchenQ)Gd -2.127e+01 5.698e+00 -3.733 0.000209 ***
factor(KitchenQ)TA -2.217e+01 5.868e+00 -3.779 0.000175 ***
factor(BasementHt)Fa -4.839e+01 8.853e+00 -5.465 6.99e-08 ***
factor(BasementHt)Gd -3.652e+01 5.092e+00 -7.172 2.37e-12 ***
factor(BasementHt)None NA NA NA NA
factor(BasementHt)TA -4.732e+01 6.135e+00 -7.714 5.70e-14 ***
factor(Condition)3 5.008e+00 2.711e+01 0.185 0.853526
factor(Condition)4 1.510e+01 2.651e+01 0.570 0.569197
factor(Condition)5 2.705e+01 2.626e+01 1.030 0.303407
factor(Condition)6 3.544e+01 2.632e+01 1.346 0.178711
factor(Condition)7 4.263e+01 2.634e+01 1.618 0.106134
factor(Condition)8 5.135e+01 2.641e+01 1.945 0.052332 .
factor(Condition)9 6.339e+01 2.722e+01 2.329 0.020235 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 23.42 on 555 degrees of freedom
Multiple R-squared: 0.8948, Adjusted R-squared: 0.8874
F-statistic: 121.1 on 39 and 555 DF, p-value: < 2.2e-16
StepwiseMod = lm(Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(HouseStyle) + factor(Condition) + I(YearBuilt^2) +
BasementSF + LotArea + I(LotArea^2) + factor(KitchenQ) +
I(GarageCars^2) + factor(BasementFin) + FullBath + GroundSF +
TotalRooms + GarageCars + WoodDeckSF, data=AmesTrain6a)
summary(StepwiseMod)
Call:
lm(formula = Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(HouseStyle) + factor(Condition) + I(YearBuilt^2) +
BasementSF + LotArea + I(LotArea^2) + factor(KitchenQ) +
I(GarageCars^2) + factor(BasementFin) + FullBath + GroundSF +
TotalRooms + GarageCars + WoodDeckSF, data = AmesTrain6a)
Residuals:
Min 1Q Median 3Q Max
-84.323 -11.695 -1.128 11.311 111.963
Coefficients: (1 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.578e+02 7.287e+01 -4.910 1.20e-06 ***
factor(ExteriorQ)Fa -7.652e+01 1.346e+01 -5.684 2.13e-08 ***
factor(ExteriorQ)Gd -4.455e+01 8.171e+00 -5.452 7.52e-08 ***
factor(ExteriorQ)TA -6.249e+01 8.774e+00 -7.122 3.30e-12 ***
I(GroundSF^2) 8.318e-06 3.636e-06 2.287 0.022543 *
factor(BasementHt)Fa -4.839e+01 8.853e+00 -5.465 6.99e-08 ***
factor(BasementHt)Gd -3.652e+01 5.092e+00 -7.172 2.37e-12 ***
factor(BasementHt)None -5.351e+01 1.088e+01 -4.918 1.15e-06 ***
factor(BasementHt)TA -4.732e+01 6.135e+00 -7.714 5.70e-14 ***
factor(HouseStyle)1.5Unf 2.897e+01 1.160e+01 2.497 0.012826 *
factor(HouseStyle)1Story 1.281e+01 4.702e+00 2.725 0.006637 **
factor(HouseStyle)2.5Unf 1.156e+01 1.461e+01 0.791 0.429161
factor(HouseStyle)2Story 4.640e+00 4.357e+00 1.065 0.287368
factor(HouseStyle)SFoyer 1.714e+00 7.480e+00 0.229 0.818817
factor(HouseStyle)SLvl 8.855e+00 5.570e+00 1.590 0.112434
factor(Condition)3 5.008e+00 2.711e+01 0.185 0.853526
factor(Condition)4 1.510e+01 2.651e+01 0.570 0.569197
factor(Condition)5 2.705e+01 2.626e+01 1.030 0.303407
factor(Condition)6 3.544e+01 2.632e+01 1.346 0.178711
factor(Condition)7 4.263e+01 2.634e+01 1.618 0.106134
factor(Condition)8 5.135e+01 2.641e+01 1.945 0.052332 .
factor(Condition)9 6.339e+01 2.722e+01 2.329 0.020235 *
I(YearBuilt^2) 1.281e-04 1.726e-05 7.426 4.26e-13 ***
BasementSF 1.360e-02 5.324e-03 2.555 0.010885 *
LotArea 4.725e-03 7.742e-04 6.104 1.95e-09 ***
I(LotArea^2) -1.464e-07 3.004e-08 -4.873 1.44e-06 ***
factor(KitchenQ)Fa -3.115e+01 8.731e+00 -3.568 0.000391 ***
factor(KitchenQ)Gd -2.127e+01 5.698e+00 -3.733 0.000209 ***
factor(KitchenQ)TA -2.217e+01 5.868e+00 -3.779 0.000175 ***
I(GarageCars^2) 4.559e+00 1.474e+00 3.093 0.002083 **
factor(BasementFin)BLQ 7.728e-01 4.204e+00 0.184 0.854217
factor(BasementFin)GLQ 3.401e+00 3.363e+00 1.011 0.312315
factor(BasementFin)LwQ -9.447e+00 5.284e+00 -1.788 0.074359 .
factor(BasementFin)None NA NA NA NA
factor(BasementFin)Rec -9.769e-01 4.501e+00 -0.217 0.828267
factor(BasementFin)Unf -7.594e+00 3.189e+00 -2.382 0.017571 *
FullBath -7.362e+00 2.750e+00 -2.677 0.007657 **
GroundSF 5.274e-02 1.456e-02 3.623 0.000318 ***
TotalRooms -3.331e+00 1.248e+00 -2.670 0.007805 **
GarageCars -7.686e+00 4.907e+00 -1.567 0.117796
WoodDeckSF 1.299e-02 9.113e-03 1.425 0.154713
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 23.42 on 555 degrees of freedom
Multiple R-squared: 0.8948, Adjusted R-squared: 0.8874
F-statistic: 121.1 on 39 and 555 DF, p-value: < 2.2e-16
ForwardMod= lm(Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(HouseStyle) + factor(Condition) + I(YearBuilt^2) +
BasementSF + LotArea + I(LotArea^2) + factor(KitchenQ) +
I(GarageCars^2) + factor(BasementFin) + FullBath + GroundSF +
TotalRooms + GarageCars, data=AmesTrain6a)
summary(ForwardMod)
Call:
lm(formula = Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(HouseStyle) + factor(Condition) + I(YearBuilt^2) +
BasementSF + LotArea + I(LotArea^2) + factor(KitchenQ) +
I(GarageCars^2) + factor(BasementFin) + FullBath + GroundSF +
TotalRooms + GarageCars, data = AmesTrain6a)
Residuals:
Min 1Q Median 3Q Max
-82.839 -12.255 -0.778 11.658 110.462
Coefficients: (1 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.536e+02 7.288e+01 -4.851 1.59e-06 ***
factor(ExteriorQ)Fa -7.716e+01 1.347e+01 -5.729 1.66e-08 ***
factor(ExteriorQ)Gd -4.493e+01 8.174e+00 -5.496 5.92e-08 ***
factor(ExteriorQ)TA -6.272e+01 8.780e+00 -7.143 2.87e-12 ***
I(GroundSF^2) 8.590e-06 3.635e-06 2.363 0.018454 *
factor(BasementHt)Fa -4.930e+01 8.838e+00 -5.578 3.81e-08 ***
factor(BasementHt)Gd -3.667e+01 5.096e+00 -7.196 2.01e-12 ***
factor(BasementHt)None -5.439e+01 1.087e+01 -5.003 7.60e-07 ***
factor(BasementHt)TA -4.812e+01 6.114e+00 -7.871 1.85e-14 ***
factor(HouseStyle)1.5Unf 2.831e+01 1.160e+01 2.439 0.015028 *
factor(HouseStyle)1Story 1.286e+01 4.706e+00 2.733 0.006474 **
factor(HouseStyle)2.5Unf 1.108e+01 1.462e+01 0.758 0.448730
factor(HouseStyle)2Story 4.800e+00 4.360e+00 1.101 0.271413
factor(HouseStyle)SFoyer 1.938e+00 7.485e+00 0.259 0.795765
factor(HouseStyle)SLvl 9.936e+00 5.523e+00 1.799 0.072551 .
factor(Condition)3 4.885e+00 2.714e+01 0.180 0.857209
factor(Condition)4 1.515e+01 2.654e+01 0.571 0.568369
factor(Condition)5 2.728e+01 2.628e+01 1.038 0.299756
factor(Condition)6 3.565e+01 2.635e+01 1.353 0.176596
factor(Condition)7 4.295e+01 2.636e+01 1.629 0.103845
factor(Condition)8 5.208e+01 2.643e+01 1.971 0.049257 *
factor(Condition)9 6.437e+01 2.724e+01 2.363 0.018448 *
I(YearBuilt^2) 1.274e-04 1.726e-05 7.378 5.86e-13 ***
BasementSF 1.406e-02 5.319e-03 2.643 0.008446 **
LotArea 4.789e-03 7.736e-04 6.191 1.17e-09 ***
I(LotArea^2) -1.482e-07 3.004e-08 -4.934 1.07e-06 ***
factor(KitchenQ)Fa -3.194e+01 8.721e+00 -3.662 0.000274 ***
factor(KitchenQ)Gd -2.141e+01 5.702e+00 -3.754 0.000192 ***
factor(KitchenQ)TA -2.246e+01 5.869e+00 -3.827 0.000144 ***
I(GarageCars^2) 4.595e+00 1.475e+00 3.115 0.001935 **
factor(BasementFin)BLQ 8.770e-01 4.207e+00 0.208 0.834937
factor(BasementFin)GLQ 3.297e+00 3.366e+00 0.980 0.327636
factor(BasementFin)LwQ -9.818e+00 5.282e+00 -1.859 0.063603 .
factor(BasementFin)None NA NA NA NA
factor(BasementFin)Rec -1.153e+00 4.504e+00 -0.256 0.798004
factor(BasementFin)Unf -7.892e+00 3.185e+00 -2.478 0.013503 *
FullBath -7.445e+00 2.752e+00 -2.705 0.007040 **
GroundSF 5.212e-02 1.456e-02 3.579 0.000375 ***
TotalRooms -3.337e+00 1.249e+00 -2.673 0.007747 **
GarageCars -7.689e+00 4.911e+00 -1.566 0.118012
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 23.44 on 556 degrees of freedom
Multiple R-squared: 0.8944, Adjusted R-squared: 0.8872
F-statistic: 124 on 38 and 556 DF, p-value: < 2.2e-16
The forward selection model has the best parsimony: it suggests a 16-variable model with an adjusted r-squared of .887, an AIC of 40.22, and a Mallows' Cp of about 40. (This is one fewer variable, at the cost of a slightly lower (.0002) adjusted r-squared, than the models suggested by stepwise and backward selection.) We chose the forward selection model because it uses the fewest variables, which is best for parsimony, especially given how many variables we could use.
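The trade-off described above (fewer terms against a slightly lower adjusted r-squared) can be tabulated directly. The following is a minimal sketch on simulated stand-in data, since AmesTrain6a is not reproduced here; `toy`, `small_mod`, `large_mod`, and the helper `compare_models()` are hypothetical names introduced only for illustration.

```r
# Hypothetical stand-in data; this is NOT the Ames data used in the report.
set.seed(1)
n <- 200
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
toy$y <- 3 + 2 * toy$x1 - 1.5 * toy$x2 + rnorm(n)  # x3 is pure noise

# Two nested candidates, analogous to a smaller and a larger selected model.
small_mod <- lm(y ~ x1 + x2, data = toy)
large_mod <- lm(y ~ x1 + x2 + x3, data = toy)

# compare_models() is an illustrative helper, not a base R function.
# It reports the number of model terms, adjusted R-squared, and AIC per model.
compare_models <- function(...) {
  mods <- list(...)
  data.frame(
    model  = names(mods),
    terms  = sapply(mods, function(m) length(attr(terms(m), "term.labels"))),
    adj_r2 = sapply(mods, function(m) summary(m)$adj.r.squared),
    aic    = sapply(mods, AIC),
    row.names = NULL
  )
}

comparison <- compare_models(small = small_mod, large = large_mod)
print(comparison)
```

With the report's own objects, the analogous call would be `compare_models(forward = ForwardMod, backward = BackwardMod, stepwise = StepwiseMod)`. Note that `AIC()` reports the usual likelihood-based AIC, which is on a different scale from the Cp column printed by `step()`.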
mean(ForwardMod$residuals)
[1] -1.845529e-16
sd(ForwardMod$residuals)
[1] 22.53261
mean(modTransformFull$residuals)
[1] 4.097209e-16
sd(modTransformFull$residuals)
[1] 22.57059
plot(ForwardMod$residuals)
abline(0,0)
ShrunkenMod=lm(Price ~ factor(ExteriorQ) + I(GroundSF^2) + factor(BasementHt) +
factor(HouseStyle) + factor(Condition) + I(YearBuilt^2) +
BasementSF + LotArea + I(LotArea^2) + factor(KitchenQ) +
I(GarageCars^2) + factor(BasementFin) + FullBath + GroundSF +
TotalRooms + GarageCars, data=AmesTrain6)
RefitAmes=predict.lm(ShrunkenMod, newdata=AmesTest6)
prediction from a rank-deficient fit may be misleading
cor(AmesTest6$Price,RefitAmes)
[1] 0.9135647
crosscorr=cor(AmesTest6$Price,RefitAmes)
cor(log(AmesTest6$Price),RefitAmes)
[1] 0.9045966
crosscorr=cor(AmesTest6$Price,RefitAmes)
crosscorr^2
[1] 0.8346004
.8872-crosscorr^2
[1] 0.05259962
The chunk beginning at line 312 shows us that having a model with fewer variables (ForwardMod) does not dramatically change either the mean or the standard deviation of the residuals. The mean residual of ForwardMod is essentially zero (about -1.8e-16), as is expected for any least-squares fit with an intercept, so the zero-mean condition holds; this by itself does not establish normality. The plot of the residuals shows no fanning pattern or skew, which supports the constant-variance condition. The residual standard deviation (about 22.5) is also small relative to the scale of the prices. These are all good signs and encourage us to use ForwardMod. ForwardMod is also a more efficient model than the full model because it has fewer variables for essentially the same r-squared value (.002 difference). The cross-validation also shows healthy shrinkage for ForwardMod: the training adjusted r-squared (.8872) exceeds the squared cross-validation correlation on the holdout data (.8346) by only about .05. This suggests we did not badly overfit the model.
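The residual check and the shrinkage calculation discussed above can be sketched end to end as follows. This uses simulated stand-in data rather than the Ames training and holdout sets, and the names `toy`, `train`, `holdout`, and `fit` are assumptions made for the example; the shrinkage is computed the same way as in the report (training adjusted r-squared minus the squared cross-validation correlation).

```r
# Hypothetical stand-in data; this is NOT the Ames data used in the report.
set.seed(2)
n <- 300
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$y <- 10 + 4 * toy$x1 - 2 * toy$x2 + rnorm(n, sd = 2)

# Split into a training set and a holdout set.
train   <- toy[1:200, ]
holdout <- toy[201:300, ]

fit <- lm(y ~ x1 + x2, data = train)

# For a least-squares fit with an intercept, the residuals average to
# (numerically) zero by construction, regardless of their distribution.
mean_resid <- mean(residuals(fit))

# Cross-validation correlation: observed holdout response vs. predictions
# from the model fitted on the training data only.
holdout_pred <- predict(fit, newdata = holdout)
crosscorr <- cor(holdout$y, holdout_pred)

# Shrinkage: training adjusted R-squared minus squared cross-validation
# correlation. Small positive values suggest little overfitting.
shrinkage <- summary(fit)$adj.r.squared - crosscorr^2

print(c(mean_resid = mean_resid, crosscorr = crosscorr, shrinkage = shrinkage))
```

Because the residual mean is zero by construction, a normal quantile plot of the residuals (for example `qqnorm(residuals(fit))`) would be the more direct check of normality.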
We chose not to make any more adjustments to our model, because we think it does a good job balancing the number of variables it uses against its predictive ability. This conclusion is supported by the well-behaved residuals and the small shrinkage (about .05). The adjusted r-squared is high and we have relatively good parsimony.
newpredictiondata= data.frame(ExteriorQ="Gd", BasementHt="Ex", Condition=5, YearBuilt=1995, BasementSF=1150, KitchenQ="TA", GarageCars=2, BasementFin="Unf", TotalRooms=9, GroundSF=2314, FullBath=2, HouseStyle="2Story", WoodDeckSF=274, LotArea=11060)
predict.lm(ForwardMod, newpredictiondata, interval="prediction", level=.95)
prediction from a rank-deficient fit may be misleading
fit lwr upr
1 234.182 178.715 289.6491
For a 2-story house in Ames, Iowa, with good exterior quality, excellent basement height, average overall condition (5), built in 1995, with 1,150 sq ft of basement, average kitchen quality, garage space for 2 cars, an unfinished basement, 9 total rooms, 2,314 sq ft of living area, 2 full baths, 274 sq ft of wood deck, and an 11,060 sq ft lot, we predict a price of $234,182. We are 95% confident that the price of such a house will fall between $178,715 and $289,649.
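The dollar figures quoted above follow from the `predict.lm` output by a unit conversion. The short sketch below assumes, as the interpretation implies, that Price is recorded in thousands of dollars; the three numbers are copied from the prediction output, and `pred_thousands` and `pred_dollars` are names introduced only for this example.

```r
# Values copied from the predict.lm() output above (fit, lower, upper).
# Assumption: Price is measured in thousands of dollars.
pred_thousands <- c(fit = 234.182, lwr = 178.715, upr = 289.6491)

# Convert to whole dollars.
pred_dollars <- round(pred_thousands * 1000)

# Format with a dollar sign and thousands separators for reporting.
pred_labels <- paste0("$", format(pred_dollars, big.mark = ",", trim = TRUE))
names(pred_labels) <- names(pred_dollars)
print(pred_labels)
```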